An Abstract Programming System 



David A. Plaisted* 
Department of Computer Science 
UNC Chapel Hill 
Chapel Hill, NC 27599-3175 
Phone: (919) 967-9238 
Fax: (919) 962-1799 

Email: plaisted@cs.unc.edu 

February 1, 2008 



Abstract 

The system PL permits the translation of abstract proofs of program correctness into pro- 
grams in a variety of programming languages. A programming language satisfying certain 
axioms may be the target of such a translation. The system PL also permits the construction 
and proof of correctness of programs in an abstract programming language, and permits the 
translation of these programs into correct programs in a variety of languages. The abstract pro- 
gramming language has an imperative style of programming with assignment statements and 
side-effects, to allow the efficient generation of code. The abstract programs may be written by 
humans and then translated, avoiding the need to write the same program repeatedly in different 
languages or even the same language. This system uses classical logic, is conceptually simple, 
and permits reasoning about nonterminating programs using Scott-Strachey style denotational 
semantics. 

1 Introduction 

The purpose of the system PL is to permit the construction of proofs that can be viewed as abstract 
programs and translated into correct programs in a variety of programming languages. The empha- 
sis is not on the automatic construction of the proofs but on the process of translating them into 
programs in specific programming languages. A programming language must satisfy certain condi- 
tions in order to be the target of such a translation, and typical procedural, functional, and logic 
programming languages satisfy these conditions. This system uses classical logic and Scott-Strachey 
domain theory [Sto77]. This system therefore, in theory, permits reasoning about nonterminating 
programs and nondeterministic programs. The system may also be used to construct abstract pro- 
grams without proving them correct, and these abstract programs can also be translated into a 
variety of programming languages. 

For a given programming language L, the axioms PL(L) describe the properties the language 
L must satisfy in order for PL proofs to be translatable into L. A programming language L is 
PL-feasible if it satisfies the axioms of PL(L). (Actually, the set of PL-feasible languages may 
differ from one program to another, because some programs may require different precisions of 
floating point numbers or various sizes of character strings or integers that may not be available in 
all languages, et cetera.) PL(L) permits the construction of proofs that a particular L program P 
satisfies a specification. The system PL* permits the construction of abstract proofs that correspond 
to PL(L) proofs and therefore to correct programs in any PL-feasible language L. For each PL-feasible 
language L, there is an effective function from proofs in PL* to correct programs in L. 
This permits the translation of PL* proofs into correct programs in any PL-feasible language. An 
abstract programming language L* corresponds to the system PL*, and there are likewise effective 
functions from correct L* programs to correct programs in any PL-feasible language L. It is possible 
to prove the correctness of L* programs using PL* or instead to gain confidence in the reliability 
of L* programs by testing or some other means; thus it is not necessary to prove correctness in 
order to translate L* programs into PL-feasible languages. Note that PL is not concerned with the 
details of the semantics of PL-feasible languages L; it only requires that L satisfy the axioms given 
for PL-feasible languages. 

In order to have a realistic representation of algorithms at an abstract level in PL, it is necessary 
to include imperative features in PL that permit an accurate representation of operations and data 
structures that efficient algorithms use. For example, PL formalizes the destructive modification 
of data, as well as the side effects of operations on data; one cannot realistically describe quicksort 
without the former, and one cannot realistically describe binary tree manipulation routines without 
the latter. The operations in PL are few enough in number to preserve its simplicity, but inclusive 
enough to permit the generation of efficient code through the specification of destructive assignment 
statements and side effects. 

2 Background and Discussion 

There has been substantial work in program generation and programming logics. A number of 
papers discuss program synthesis based on constructive logic, type theory, and the Curry-Howard 
isomorphism [Con85, CS93]. For example, Martin-Löf's theory of types [ML82] and the calculus 
of constructions of Coquand and Huet [CH88] [vHURS95] are used for this purpose. Bittel [Bit92] 
and Kanovich [Kan91] describe program synthesis in intuitionistic logic. NuPRL [CAB+86] uses a 
hybrid system of logic and type theory. The proofs-as-programs paradigm of Bates and Constable 
[BC85] interprets constructive proofs as executable programs using the Curry-Howard isomorphism. 
However, such proofs contain both computational content and correctness arguments; in order 
to obtain efficient code, it is useful to separate these, which is not always simple (Berger and 
Schwichtenberg [BS96]). Avellone et al. [AFM99] discuss this separation in the context of a system 
for reasoning about abstract data types. They discuss program synthesis from constructive proofs, 
but in which abstract data types have classical semantics. Jeavons et al. [JPCB00] present another 
system for separating these parts of constructive proofs using the Curry-Howard isomorphism. Most 
such systems synthesize functional programs, but Mason [Mas97] studies the synthesis of imperative 
programs in a lambda calculus framework. 

Another line of work in program generation makes use of schemata. The idea of schema based 
program synthesis [FLOR99] is to consider a schema as a generalized program that can be instan- 
tiated to a number of specific programs. Such schemata can be derived formally or constructed 
manually, and the correctness proofs can be manual or with automated assistance. Once obtained, 
schemata can be instantiated and combined to produce a variety of correct programs. Also, systematic 
methods exist for transforming schemata into different schemata. Huet and Lang [HL78] 
studied the transformation of one schema to another, and many other such transformation systems 
[Par90] have been studied. This can include tail recursion elimination, for example. An advantage 
of the schema-based approach to program generation as compared to constructive derivations of 
programs from scratch is that the deductive and programming tasks are easier. The schema-based 
approach also permits the use of classical logic in reasoning about schemata. Of course, 
schema-based development is also possible in constructive logics. Anderson and Basin [AB00] mention 
that schemata are not general enough to capture some programming knowledge, including the 
design patterns of Gamma et al. [GHJV95]. 

Schema-based program synthesis as in Flener et al. [FLOR99] is concerned with synthesizing 
a variety of logic programs, possibly in a single language, from a schema. A schema is typically 
open, which means that some of the predicates (representing procedures) are not defined. Thus, the 
schema can instantiate to different programs if different definitions of the undefined procedures are 
given. Flener et al. [FLOR99] separate a schema into a template, which is an abstract program, and a 
specification framework, or collection of axioms giving the intended semantics of the problem domain. 
Their work was strongly influenced by the work of Smith [Smi90] in this respect. The semantics 
can either be isoinitial, a restriction of initial algebra semantics in which negative as well as positive 
equations are preserved [FLO98], or can be based on logic program semantics using completions of 
a logic program [LOT99]. A schema that is steadfast is guaranteed always to instantiate to correct 
programs. The synthesis process combines schemata, often by instantiating the open predicates 
of one schema using predicates from another. Buyukyildiz and Flener [BF97] study rules for the 
transformation of one logic program schema into another. Lau, Ornaghi, and Tarnlund [LOT99] 
discuss the relationship of schemata to object-oriented programming. Deville and Lau [DL94] discuss 
constructive, deductive, and inductive synthesis of logic programs. 

Anderson and Basin [AB00] show how to view program schemata as derived rules of inference in 
higher-order logic. This approach can encompass both functional and logic programming languages. 
Like the schema approach, this approach relies on classical logic and does not make initial algebra 
assumptions that are typical of abstract data type theory. The formalism of Anderson and Basin 
[AB00] makes use of schema variables, which can be replaced by arbitrary functions. Also, Anderson 
and Basin [AB00] emphasize logic programs, but remark that schema-based development also applies 
readily to functional programs. Shankar [Sha96] and Dold [Dol95] study program transformation in 
higher-order logic in the PVS system. 

Manna and Waldinger's deductive tableau system [MW92] also uses classical logic for synthesis 
of functional programs. Ayari and Basin [AB01] show how to express this system in Isabelle using 
higher-order logic and higher-order resolution. They give an example of synthesizing sorting 
programs (including the quicksort program) in a functional language. 

The common language runtime [Box02] of Microsoft is an attempt to ensure compatibility between 
different programming languages by compiling them all into a common intermediate language. 
This permits programs in different languages to communicate with each other. 

PL has some features in common with the preceding systems. As with schemata, PL is based 
on classical logic, is not based on the Curry-Howard isomorphism, and is only concerned with 
type theory in an incidental way. PL also explicitly separates correctness arguments from the 
computational content of a program. Abstract PL programs are similar to schemata, or templates. 
An abstract program in PL is only partially specified, in the sense that some of the procedures it uses 
may not be defined. This corresponds to the open programs of Flener et al. [FLOR99]. Steadfast logic 
programs correspond to a PL program fragment that satisfies a specification. Abstract programs 
can be combined in PL, like the schemata of Flener et al. [FLOR99], by instantiating the undefined 
procedures of one program to the procedures of another and combining the programs. Thus an 
abstract PL program can be viewed as a rule of inference for constructing programs, as in the 
system of Anderson and Basin [AB00]. The goal of PL is to avoid the need to write the same 
program over and over in different languages by permitting abstract PL programs to translate to 
many languages. Also, PL even avoids the need to write the same program many times in the same 
language, because it permits substitutions for the names of the procedures in an abstract program. 

However, the preceding approaches have a different emphasis than PL. PL does not emphasize 
the underlying logic; any sufficiently expressive logic, such as some version of set theory or higher- 
order logic, would suffice. In PL, a single logical formula mentions both an abstract formula and 
the properties it is assumed to have, rather than separating this information as is often done with 
schemata. In this, our approach is similar to that of Anderson and Basin [AB00]. Furthermore, 
PL semantics is based on Scott-Strachey style [Sto77] denotational semantics, and therefore can 
potentially reason about nonterminating and even nondeterministic computation. PL semantics 
is defined axiomatically, instead of by initial or iso-initial models. Also, the present paper is not 
concerned with program generation methodology per se, as are a number of other works. In PL, 
the emphasis is not on synthesis but on translation of an abstract program into efficient programs 
in a variety of languages. This is related to the research topic mentioned in Anderson and Basin 
[AB00] of developing a metatheory to transfer schema results from one area to another. Some other 
systems handle synthesis in particular languages including UNIX, object code, and logic programs. 
For example, Sannella and Tarlecki [ST89] discuss the formal development of ML programs from 
algebraic specifications. Bhansali and Harandi [BH93] discuss the synthesis of UNIX programs. 
Benini [Ben00] discusses program synthesis of object code. The process of translation of PL programs 
into other languages is automatic and does not require any planning or reasoning. In contrast to 
other systems, PL does not emphasize the transformation of one schema to another, except for the 
translation of schemata into specific languages and the combination of existing schemata. Another 
difference between PL and other approaches is that PL variables can only be replaced by procedure 
names, and not by arbitrary functions as in Anderson and Basin [AB00]. 

In addition, PL gives substantial attention to imperative features such as assignment statements 
and side effects that are important for efficient code generation. The treatment of side effects is 
somewhat similar to that of Mason [Mas97]. The example of quicksort from Ayari and Basin [AB01] 
uses a functional notation, in which array segments are concatenated, instead of the usual, more 
efficient approach of in-place processing of subarrays in the recursive step. Bornat [Bor00] gives a 
way to prove properties of pointer programs using Hoare logic. Another approach to side effects is 
given by Harman et al. [HHZM01] and involves transforming programs to remove side effects. Though 
PL emphasizes imperative languages, it also has applications to logic and functional programs. By 
comparison, few if any of the preceding systems emphasize translation into a variety of languages, 
nor do most of them emphasize imperative features of languages. In addition, the focus of many of 
these systems is the method of program generation. 

The focus of PL differs from that of the common language runtime, as well. The latter permits 
programs in different languages to communicate. PL permits an abstract program to translate into 
a variety of other languages at the source level, and thereby avoids the need to write the same 
program many times in many languages. 

In general, PL is not so much concerned with how programs are synthesized as with axiomatizing 
their correctness in an abstract setting, so as to guarantee the correctness of their translations into 
specific languages. The user would typically write programs in PL and provide proofs of their 
correctness. 

Current program generation methods have several problems: 1. The demands on the formal 
reasoning part of the process are too stringent. 2. The generated code is not always as efficient as 
possible. 3. The logic is often unfamiliar to the typical user. 

The system PL seeks to overcome these problems by permitting the writing and debugging of 
abstract programs without any reasoning at all, if desired. However, it is possible to verify the 
abstract programs. These programs then translate into efficient code in a variety of other languages. 
This is possible because the abstract programs permit an imperative style of programming with 
assignment statements and side effects. The use of classical logic helps to solve the third problem. 

3 Axioms of Program Language Semantics 
3.1 Introduction to Axioms 

The system PL(L) refers to program fragments in programming languages L. A program fragment P 
of L is a portion of a program in L that specifies the definition of some procedures and data in terms 
of others. Data may be integers, arrays, lists, trees, or other data structures typically referenced by 
program variables. The outputs of P are the procedures and data that are defined in P in terms of 
other procedures and data. The inputs of P are the procedures and data that are referenced in P 
but not defined there. Thus if x are the inputs of P and y are the outputs of P, P defines a function 
from the semantics of x to the semantics of y. x and y are variables of P in the system PL(L). 

There are several operations on program fragments in the system PL(L). If P and Q are program 
fragments, then P; Q represents the sequential composition of P and Q (P then Q). PL(L) does not 
have a parallel composition operator; it would be possible to add additional operators such as parallel 
composition and object inheritance to PL(L). $\exists x P$ represents P with the variable x declared "local" 
so that it is not visible outside of $\exists x P$. If $\theta$ is a substitution, then $P\theta$ is P with program variables 
(procedures and data) substituted as specified by $\theta$. $\mu$ is a least fixpoint operator on programs, 
corresponding to the definition of recursive procedures. $\uparrow^p_x P$ represents P with the new procedure 
p defined; this corresponds to a program of the form procedure p(x); P having P as the procedure 
body. $\downarrow^p_x P$ represents the application of a procedure to arguments. This corresponds to a program 
of the form P; call p(x) where p is a procedure defined in P. The program P may be empty in 
this case. $x?P?Q$ represents the conditional "if x then P else Q" where P and Q are program 
fragments, possibly empty. $\mu(u, v, P, >)$ represents the "fixed point" of P with procedures u and v 
identified, where > is the "definedness" ordering for the denotational semantics of P. Each program 
P has a corresponding PL textual syntax $P^{text}$ so that the textual syntax for $P; Q$ is $P^{text}; Q^{text}$, 
the textual syntax for $\exists x P$ is var x; $P^{text}$, the textual syntax for $\uparrow^p_x P$ is proc p(x); $P^{text}$ end p, 
the textual syntax for $\downarrow^p_x P$ is $P^{text}$; call p(x), and the textual syntax for $x?P?Q$ is if x then $P^{text}$ 
else $Q^{text}$ fi. Also, $\mu(u, v, P(u, v), >)^{text}$ is typically $P(v, v)^{text}$. In addition to this, of course, an L 
program will have a syntax specified by the language L. 
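
The mapping from operators to textual syntax described above can be sketched as a small recursive function. This is only an illustrative sketch in Python: the class names (Atom, Seq, Local, ProcDef, Call, Cond) and the function to_text are hypothetical and not part of PL, the fixed point operator is omitted, and the exact rendering of the call form is an assumption based on the "P; call p(x)" description.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Atom:
    text: str  # opaque fragment already written in the textual syntax


@dataclass
class Seq:
    first: "Frag"
    second: "Frag"


@dataclass
class Local:
    var: str
    body: "Frag"


@dataclass
class ProcDef:
    name: str
    params: str
    body: "Frag"


@dataclass
class Call:
    name: str
    args: str
    body: Optional["Frag"] = None  # the defining fragment may be empty


@dataclass
class Cond:
    test: str
    then: "Frag"
    other: "Frag"


Frag = Union[Atom, Seq, Local, ProcDef, Call, Cond]


def to_text(p: Frag) -> str:
    """Render a fragment in the (assumed) PL textual syntax."""
    if isinstance(p, Atom):
        return p.text
    if isinstance(p, Seq):
        return f"{to_text(p.first)}; {to_text(p.second)}"
    if isinstance(p, Local):
        return f"var {p.var}; {to_text(p.body)}"
    if isinstance(p, ProcDef):
        return f"proc {p.name}({p.params}); {to_text(p.body)} end {p.name}"
    if isinstance(p, Call):
        call = f"call {p.name}({p.args})"
        return call if p.body is None else f"{to_text(p.body)}; {call}"
    if isinstance(p, Cond):
        return f"if {p.test} then {to_text(p.then)} else {to_text(p.other)} fi"
    raise TypeError(f"unknown fragment: {p!r}")
```

A translator for a concrete PL-feasible language L would follow the same recursive shape, emitting the syntax of L in each case instead.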



3.1.1 Side effects 

In a realistic system, one needs to formalize imperative operations on data for efficiency; for example, 
one may have a program that repeatedly updates a database, or repeatedly modifies an array, graph 
or buffer. Creating a new copy of a data structure each time it is modified is inefficient. Typically 
one can assign values repeatedly to a program variable using assignment statements. However, 
procedures typically have only one definition. In order to accommodate this distinction in PL, there 
are both procedure and data variables, and the semantics of a data variable x in a program fragment 
P is an ordered pair $(\alpha, \beta)$ where $\alpha$ is the initial value of x (when P begins) and $\beta$ is the final value of 
x (when P ends). By convention, $\alpha = x^{init}$ and $\beta = x^{fin}$. In PL, a procedure p(x, y) with semantics 
$y^{fin} = x^{init}$ can express an assignment statement y := x. There are no assignment statements per 
se in PL, and no arithmetic or Boolean operators. PL procedures with an appropriate specification 
represent such statements and operators. In the translation to an L program, such procedures would 
translate to the corresponding assignment statements and operators. 
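
As a rough illustration of this convention, the following Python sketch models the semantics of a data variable as an (initial, final) pair and checks the specification of an assignment-like procedure. The function names are hypothetical, and the assumption that x itself is left unchanged is ours; the specification in the text constrains only the final value of y.

```python
def assign_semantics(x_init, y_init):
    """Semantics of a hypothetical procedure p(x, y) behaving like y := x.

    Each data variable is modelled as an (initial, final) pair.
    Assumption: x is only read, so its final value equals its initial value.
    """
    x = (x_init, x_init)
    y = (y_init, x_init)  # the final value of y is the initial value of x
    return x, y


def satisfies_assignment_spec(x, y):
    """Check the specification: final value of y equals initial value of x."""
    (x_init, _x_fin), (_y_init, y_fin) = x, y
    return y_fin == x_init


x_sem, y_sem = assign_semantics(3, 7)
```

In a translation to a concrete language, a procedure with this specification would simply become the assignment statement of that language.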

For some algorithms, such as binary tree manipulation routines, pointer manipulations are nec- 
essary. A proper treatment of pointers requires a modification to the semantics of variables. If a 
variable points to the root of a binary tree, then the semantics of the variable should include the 
whole tree. If a variable points to the root of a LISP list, the semantics of the variable should include 
the entire list. Therefore, the semantics of a variable needs to include all other values that may be 
reached from the variable by a sequence of pointers. This implies that there are side effects. If x 
points to a binary tree T having T' as a subtree, and y points to T', then a change to a substructure 
of y will change x as well. In general, any change to a pointer will affect any structure containing 
this pointer; this is the kind of side effect that PL can formalize. 
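
The following Python sketch illustrates this kind of side effect under simple assumptions: a hypothetical Node class stands in for tree cells, and the "semantics" of a variable is approximated by the tuple of values reachable from it. It is meant only to show why a change made through y is visible through x.

```python
class Node:
    """A binary tree cell; left and right play the role of pointers."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def reachable(root):
    """Values reachable from root by following pointers, in preorder."""
    if root is None:
        return ()
    return (root.value,) + reachable(root.left) + reachable(root.right)


t_prime = Node(2, Node(4), Node(5))  # the subtree T'
x = Node(1, t_prime, Node(3))        # x points to T, which contains T'
y = t_prime                          # y points to T'

x_init = reachable(x)                # semantics of x before the change
y.left = Node(9)                     # modify a substructure through y
x_fin = reachable(x)                 # semantics of x after the change
```

Here x is never assigned, yet its reachable structure differs before and after the update through y.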

Side effects can also occur if a procedure modifies variables declared outside the procedure body. 
The syntax of PL-feasible languages L prohibits this, to simplify reasoning about L programs. How- 
ever, read and write statements modify input and output files and buffers, and therefore imply side 
effects to variables declared outside a procedure body. Such variables are also called global variables 
for the procedure. To handle this, PL-feasible languages may consider certain state variables such 
as the status of input and output files as implicit parameters of every procedure. A procedure may 
also modify global variables indirectly by side effects. This is difficult to detect syntactically. We 
assume that this cannot happen. There are sufficient conditions to prohibit such side effects, such 
as the condition that no pointer manipulations or array element assignments can precede procedure 
definitions. 

In order to reason about the side effects of one actual parameter on another, we assume that 
parameters are passed by value when possible. For arrays and complex data structures such as 
lists and binary trees, parameters are passed by reference. The semantics of parameters passed by 
reference must include not only their value but also some information about the address at which 
they are stored, in order to determine the side effects of a change of one parameter on another. 

3.1.2 Functional and Logic Programming Languages 

Because PL permits an imperative style of programming, it may be difficult to encode abstract PL 
programs in pure functional and logic programming languages. However, many functional and logic 
programming languages have imperative features added for efficiency, facilitating the translation of 
PL programs into these languages. Some restrictions on PL programs may facilitate their translation 
into pure functional and logic programming languages. For example, if a PL program is written in 
a single assignment style, in which each data variable is assigned at most once, then the translation 
into functional and logic programming languages appears to be fairly direct. The single assignment 
style of programming also minimizes side effects. A more restrictive class of PL programs consists of those 
without any data variables, and these should be even easier to translate into pure functional and 
logic programming languages. 

3.1.3 Substitutions 

There are a number of axioms and rules of inference about substitutions in PL. These are necessary 
in order to reason about specific instances of general programs. Suppose that P(x,y) and Q(u,v) 
are program fragments in PL(L). Suppose one has assertions A(x,y) and B(u,v) expressing the 
properties of P and Q. The program fragment P(x,y);Q(y,v) expresses a sequential composition 
of P and Q, with the variable y of Q replacing the variable u. It is desirable to reason about 
the properties of this combined program fragment. It is plausible to assume that the assertion 
$A(x, y) \wedge B(y, v)$ would hold for the program fragment P(x,y);Q(y,v). However, deriving this 
assumption requires axioms about how the assertions A and B behave under substitutions to the 
programs P and Q. Therefore PL contains a number of axioms about substitutions and their 
influence on program semantics. These axioms enable the derivation of properties of substitution 
instances of a general program from properties of the general program, and therefore facilitate 
the construction of programs in PL from general building blocks. PL restricts such reasoning to 
substitutions that do not identify output variables, because this assumption simplifies the axioms. 

As an example where identifying output variables leads to unusual behavior of instances of a 
general program, consider the program P(x,y) equal to x := x + 1; y := y + 1 and the assertion 
$A(x, y) = (x^{fin} = x^{init} + 1 \wedge y^{fin} = y^{init} + 1)$. The program P(x, x) is then x := x + 1; x := x + 1 
and no longer satisfies the assertion $A(x, x) = (x^{fin} = x^{init} + 1 \wedge x^{fin} = x^{init} + 1)$. Instead, P(x, x) 
satisfies the assertion $x^{fin} = x^{init} + 2$. 
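
The effect of identifying output variables can be checked directly. The sketch below, in Python, represents the program P(x, y) as a function over a store (a dictionary from variable names to values); the names P and run and the store representation are illustrative assumptions, not part of PL.

```python
def P(store, x, y):
    """The program x := x + 1; y := y + 1 over a store of named variables."""
    store[x] = store[x] + 1
    store[y] = store[y] + 1


def run(program, names, initial):
    """Run program on a copy of the initial store and return the final store."""
    store = dict(initial)
    program(store, *names)
    return store


# Distinct output variables: each one is incremented once.
distinct = run(P, ("x", "y"), {"x": 10, "y": 20})

# Identified output variables: P(x, x) increments x twice.
identified = run(P, ("x", "x"), {"x": 10})
```

With distinct variables the assertion A holds; with x and y identified, the final value of x is its initial value plus 2, so A(x, x) fails.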

Because PL is a general system for reasoning about programs in various languages L, it is 
necessary for PL formulas to refer to programs in L. PL views L programs simply as strings in a 
language, with certain program variables (names of procedures and data variables) replaced by PL 
variables in order to reason about instances of general programs. 

3.2 Terminology 

L is a programming language and P and Q denote programs or fragments of programs in L. These 
are sometimes written as $P^L$ and $Q^L$ to specify L. Programs in L are assumed to satisfy the axioms 
of the system PL(L) given below. 

In the notation $[x]P(y)$, P is a program fragment containing variables y that may represent 
procedures or data. Variables may appear more than once in x and y. The variables y are the 
"schema variables" of program P and x is a listing of these variables in the order they will appear 
in assertions about P. The free variables FV(P) of a program P are those procedures and variables 
of P that are not locally bound in P, so that they are available outside of P. Other variables of P 
are bound. By convention all free schema variables in P must appear in x. Variables may appear 
in x that do not appear free in P. Such variables are also elements of FV(P), by convention. P(x) 
may be an abbreviation for $[x]P(x)$. The side effect variable $\psi$ is included as the last element in the 
list x even though $\psi$ does not occur in P; this variable $\psi$ is useful for reasoning about side effects. 
The variable $\psi$ is a data variable, and the sort of $\psi$ is the union of the sorts of all data variables. 
The semantics of $\psi$ consists of pairs of the form $(\alpha, \beta)$ indicating that if $\alpha$ is the value of $\psi$ at the 
beginning of the execution of P, then side effects of P cause $\beta$ to be the value of $\psi$ at the end of the 
execution of P. For example, if P changes a pointer at address a to point to b, then the semantics of 
the side effect variable $\psi$ for P would consist of pairs $(x_1, x_2)$ such that $x_2 = x_1$ if $x_1$ is a structure 
not containing the pointer a, and if $x_1$ does contain the pointer a, then $x_2$ would be $x_1$ with this 
pointer modified to point to b. 

The side effect variable interacts with the sequential composition operator. The composition P; Q 
of two program fragments has the precondition that FV(P) = FV(Q). This precondition enables 
reasoning about side effects. For programs without side effects, this condition can be relaxed. 
Suppose P(x) has only the free data variable x and Q(y) has only the free data variable y. In order 
to compose P and Q, it is necessary to add y as a free variable to P and x as a free variable to Q. 
The new variable rule permits this, but requires the semantics of these new variables to reflect the 
side effects of the executions of P and Q. Thus in P(x), the new variable y has semantics reflecting 
the side effects of the execution of P on y. Similarly, in Q(y), the new variable x has semantics 
reflecting the side effects of the execution of Q on x. Therefore in the program fragment P(x); Q(y), 
the overall semantics of x would reflect the effect of executing P, followed by the side effects of the 
execution of Q on x, and the semantics of y would reflect the side effects of executing P, followed by 
the effect of executing Q. 

$P^{in}$ denotes the set of input variables of program P and $P^{out}$ denotes the output variables. Input 
variables of P are those that are free in P but not defined there. Output variables of P are variables 
that are defined in P and may or may not be used in P. Each variable may be data, which must be 
defined before it is used, or a procedure, which can be defined after it is used. No procedure variable 
may be defined twice. $P^{proc}$ denotes the free variables of P that are procedures and $P^{data}$ denotes 
those that are data. If $p \in P^{proc}$ then p takes zero or more arguments, which need not be listed 
among the variables of P and may either be procedures or data, and which may be inputs or outputs 
of p. The number of arguments of p is its arity. The notations $p^{in}$, $p^{out}$, $p^{proc}$, and $p^{data}$, each of 
which denotes a subset of $\{1, \ldots, n\}$ if n is the arity of p, indicate which arguments of p are inputs, 
outputs, procedures, and data. Similarly, $p^{i,in}$ et cetera give information about the $i$th argument of 
p, and $p^{i,j,in}$ et cetera give information about the $j$th argument of the $i$th argument of p, if p is a 
procedure. In general, one writes $p^{\alpha,in}$ et cetera where $\alpha$ is a sequence of integers. To avoid dealing 
with sets of integers, one writes $p(x)^{in}$, defined as $\{x_i : i \in p^{in}\}$, et cetera. Let $x^{Type}$ be the type of 
x, which we define as the function from $\alpha$ to the 4-tuple $(x^{\alpha,in}, x^{\alpha,out}, x^{\alpha,data}, x^{\alpha,proc})$, for integer 
sequences $\alpha$. Let $x^{\alpha,Type}$ be the function from $\beta$ to the 4-tuple $(x^{\alpha\beta,in}, x^{\alpha\beta,out}, x^{\alpha\beta,data}, x^{\alpha\beta,proc})$, 
for integer sequences $\alpha$ and $\beta$. Thus $x^{\alpha,Type}$ is the type of $x^{\alpha}$, in a sense. Also, $x_P$ or $x[P]$ denotes 
the variable x of the program P. 

The notation $\{z : [x]P(y)\}$ means that z is a sequence of values representing possible semantics 
of the schema variables x that appear in P, $z_i$ being the semantics of $x_i$, and y is the variables x 
listed in a possibly different order. Note that z is not a function of P, because P may be just a 
fragment of a larger program P', and some of the program variables in P may be procedures that 
are defined elsewhere in P'. The constraint $C^L_{[x]P}$ represents the constraint on the semantics of x 
imposed by the program fragment $P^L$. Thus $C^L_{[x]P}(y)$ denotes that y are possible semantics for 
the schema variables x that are consistent with the program fragment $P^L$. This is simply another 
notation for $\{y : [x]P^L\}$. 

If x and y are variables or terms, x = y means that x and y are syntactically identical. If 
x and y are sequences of variables, then $x \circ y$ denotes the concatenation of these two sequences. 
Often this is written with a comma as x, y. If x is the sequence $x_1, \ldots, x_n$ of variables then $\{x\}$ 
denotes the set $\{x_1, \ldots, x_n\}$. The notation $x \to y$ indicates that $y_i = y_j$ if $x_i = x_j$. A variable 
substitution is a function from program variables to program variables, often indicated by $\theta$. If 
P is a program fragment and $\theta$ is a variable substitution, then $P\theta$ denotes P with free variables 
replaced as specified by $\theta$, and bound variables renamed to avoid captures. A variable substitution 
$\theta$ is output-injective on P if for all distinct variables $x, y \in P^{out}$, $x\theta \neq y\theta$. If $\theta$ is output-injective 
on P then $(P\theta)^{out} = P^{out}\theta$ and $(P\theta)^{in} = P^{in}\theta - (P\theta)^{out}$. Thus a variable that is the image of both 
an input and an output variable is an output variable. A variable substitution may only identify 
variables of the same type, both of which are either procedure variables or data variables. 
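
A minimal Python sketch of these definitions follows, assuming that a substitution is represented as a dictionary from variable names to variable names (unlisted variables map to themselves) and that input and output variables are given as sets. The function names are hypothetical.

```python
def apply_subst(theta, variables):
    """Image of a set of variables under the substitution theta."""
    return {theta.get(v, v) for v in variables}


def is_output_injective(theta, outputs):
    """True if theta maps distinct output variables to distinct variables."""
    images = [theta.get(v, v) for v in outputs]
    return len(set(images)) == len(images)


def substitute_in_out(theta, inputs, outputs):
    """Input and output variables of the substituted fragment.

    Follows the rule stated in the text: the outputs are the images of the
    outputs, and the inputs are the images of the inputs minus the new outputs.
    """
    if not is_output_injective(theta, outputs):
        raise ValueError("substitution identifies output variables")
    new_out = apply_subst(theta, outputs)
    new_in = apply_subst(theta, inputs) - new_out
    return new_in, new_out


# Input a is identified with output x, so x counts only as an output.
new_in, new_out = substitute_in_out({"a": "x"}, {"a", "b"}, {"x", "y"})
```

The subtraction in the last step is what makes a variable that is the image of both an input and an output count as an output only.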

The symbol $\theta$ typically denotes a variable substitution and $\sigma$ typically denotes a function from
integers to integers. If $\theta$ is one to one it is called a variable renaming and if $\sigma$ is one to
one it is called an integer permutation. If $x$ is a tuple of program variables and $\theta$ is a variable
substitution then $x\theta$ denotes $x_1\theta, \ldots, x_n\theta$. If $\sigma$ is an integer function and $x$ is any tuple, then
$\sigma(x)$ denotes $x_{\sigma(1)}, \ldots, x_{\sigma(n)}$. Also, if $\sigma$ is an integer function and $x$ is a tuple of program variables
then $\sigma(x)$ also denotes the variable substitution $\{x_1 \to x_{\sigma(1)}, \ldots, x_n \to x_{\sigma(n)}\}$, which maps the
tuple $x$ to the tuple $\sigma(x)$. The side effect variable $\psi$ is always the last element of the list $x$ of variables,
which means that no variable substitution or integer function can change this property. For example,
for all variable substitutions $\theta$, $\psi\theta = \psi$.
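
The two actions described above, a substitution applied componentwise and an integer function applied to positions, can be sketched in a few lines of Python. This is only an illustration under assumed encodings: variables are strings, a substitution is a dictionary, and the helper names (`apply_subst`, `apply_index_fn`, `subst_from_index_fn`, `is_output_injective`) are hypothetical and do not come from PL.

```python
from typing import Dict, Sequence, Tuple

Subst = Dict[str, str]  # assumed encoding: program variable -> program variable


def apply_subst(xs: Sequence[str], theta: Subst) -> Tuple[str, ...]:
    """x theta: apply theta to each component; unmapped variables stay fixed."""
    return tuple(theta.get(x, x) for x in xs)


def apply_index_fn(xs: Sequence[str], sigma: Dict[int, int]) -> Tuple[str, ...]:
    """sigma(x) = (x_sigma(1), ..., x_sigma(n)), using 1-based positions."""
    return tuple(xs[sigma[i] - 1] for i in range(1, len(xs) + 1))


def subst_from_index_fn(xs: Sequence[str], sigma: Dict[int, int]) -> Subst:
    """The substitution {x_1 -> x_sigma(1), ..., x_n -> x_sigma(n)}."""
    return dict(zip(xs, apply_index_fn(xs, sigma)))


def is_output_injective(theta: Subst, outputs: Sequence[str]) -> bool:
    """True if theta maps distinct output variables to distinct variables."""
    images = [theta.get(v, v) for v in outputs]
    return len(set(images)) == len(images)


# The side effect variable (here called "psi") is last and is left fixed.
xs = ("u", "v", "psi")
sigma = {1: 1, 2: 1, 3: 3}          # identifies position 2 with position 1
theta = subst_from_index_fn(xs, sigma)
print(apply_index_fn(xs, sigma))    # ('u', 'u', 'psi')
print(is_output_injective(theta, ["v"]))
```

When the components of `xs` are distinct, `apply_subst(xs, theta)` and `apply_index_fn(xs, sigma)` give the same tuple, which is the sense in which the integer function and its induced substitution agree.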

The symbols $f$ and $g$ typically denote functions from variables to their semantics. If $f$ is a
function on free variables of $P$ then $f(x)$ denotes $(f(x_1), \ldots, f(x_n))$.

The formula means $A[x] \land A[y] \supset x = y$, that is, $A$ is exclusive for $x$.

The axioms of PL(L) are as follows: 

3.3 Definitions 

Definition of constraint C in terms of colon notation

$C^L_{[x]P}(y) \equiv \{y : [x]P^L\}$    (1)

Definition of R on programs If $R$ is a relation on semantics of program variables then

$R^L([x]P) \equiv \forall y(\{y : [x]P^L\} \supset R(y))$    (2)

$R^{L,\psi}$ explicitly considers the side effect variable $\psi$. $R^{L,\hat{\psi}}$ does not. If neither superscript appears,
either meaning is possible.
3.4 Axioms about $P^{in}$ and $P^{out}$

The general idea is that if a variable is an input variable in one part of a program and an output
variable elsewhere, it is an output variable for the whole program.

If $\theta$ is an output injective variable substitution then

$(P(x)\theta)^{in} = P(x)^{in}\theta - P(x)^{out}\theta$    (3)

If $\theta$ is an output injective variable substitution then

$(P(x)\theta)^{out} = P(x)^{out}\theta$    (4)

$(P(x); Q(y))^{in} = P(x)^{in} \cup Q(y)^{in} - (P(x)^{out} \cup Q(y)^{out})$    (5)

$(P(x); Q(y))^{out} = P(x)^{out} \cup Q(y)^{out}$    (6)

$(u?P(x)?Q(y))^{in} = P(x)^{in} \cup Q(y)^{in} \cup \{u\} - (P(x)^{out} \cup Q(y)^{out})$    (7)

$(u?P(x)?Q(y))^{out} = P(x)^{out} \cup Q(y)^{out}$    (8)

$(\uparrow_p^{y} P(x,y))^{in} = P(x,y)^{in} - \{y\}$    (9)

$(\uparrow_p^{y} P(x,y))^{out} = P(x,y)^{out} \cup \{p\} - \{y\}$    (10)

$(\downarrow_p^{y} P(x,p))^{in} = P(x,p)^{in} \cup p(y)^{in} - (\downarrow_p^{y} P(x,p))^{out}$    (11)

$(\downarrow_p^{y} P(x,p))^{out} = P(x,p)^{out} \cup p(y)^{out}$    (12)

$(\exists w P)^{in} = P^{in}$    (13)

$(\exists w P)^{out} = P^{out} - \{w\}$    (14)

$\mu(u,v,P,>)^{in} = P^{in} - \{u\}$    (15)

$\mu(u,v,P,>)^{out} = P^{out}$    (16)

3.5 Axioms about free variables 

$FV(P) = P^{in} \cup P^{out}$    (17)






3.5.1 Consequences of this axiom 

$FV(P; Q) = FV(P) \cup FV(Q)$

$FV(x?P?Q) = FV(P) \cup FV(Q) \cup \{x\}$

$FV(\uparrow_p^{y} P(x,y)) = \{p\} \cup \{x\}$

$FV(\downarrow_p^{y} P(x,p)) = \{x\} \cup \{y\} \cup \{p\}$

$FV(\exists z P) = FV(P) - \{z\}$

$FV(\mu(u,v,P(x,u,v),>)) = \{x\} \cup \{v\}$
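
As an informal check on the input and output axioms for composition, conditional, and output deletion, the sets can be computed mechanically. The following Python sketch assumes a fragment is summarized only by its input and output sets; the class `Frag` and the helpers `atom`, `seq`, `cond`, and `exists` are hypothetical names introduced for this illustration, and `exists` assumes the deleted variable is an output.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Frag:
    """A program fragment summarized by its input and output variable sets."""
    ins: FrozenSet[str]
    outs: FrozenSet[str]

    def fv(self) -> FrozenSet[str]:
        # Axiom (17): FV(P) = P^in U P^out
        return self.ins | self.outs


def atom(ins: Iterable[str], outs: Iterable[str]) -> Frag:
    """A basic fragment; a variable that is both read and written is an output."""
    outs_set = frozenset(outs)
    return Frag(frozenset(ins) - outs_set, outs_set)


def seq(p: Frag, q: Frag) -> Frag:
    """Axioms (5) and (6) for P; Q."""
    outs = p.outs | q.outs
    return Frag((p.ins | q.ins) - outs, outs)


def cond(u: str, p: Frag, q: Frag) -> Frag:
    """Axioms (7) and (8) for u?P?Q."""
    outs = p.outs | q.outs
    return Frag((p.ins | q.ins | {u}) - outs, outs)


def exists(w: str, p: Frag) -> Frag:
    """Axioms (13) and (14) for the output deletion operator (w assumed an output)."""
    return Frag(p.ins, p.outs - {w})


# y := x + 1 followed by v := y + 1
p = atom({"x"}, {"y"})
q = atom({"y"}, {"v"})
pq = seq(p, q)
print(sorted(pq.ins), sorted(pq.outs))   # ['x'] ['v', 'y']
print(sorted(exists("y", pq).fv()))      # ['v', 'x']
```

In the composed fragment, `y` is an input of the second statement and an output of the first, so it counts as an output of the whole, matching the general idea stated at the start of Section 3.4.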

3.6 Axioms about arguments to procedure variables 

If $x \in FV(P)$ then

$x^{Type}[P; Q] = x^{Type}[P]$

If $x \in FV(Q)$ then

$x^{Type}[P; Q] = x^{Type}[Q]$

If $x \in FV(\uparrow_p^{y} P) - \{p\}$ then

$x^{Type}[\uparrow_p^{y} P] = x^{Type}[P]$

If $x \in FV(\downarrow_p^{y} P) - \{y\}$ then

yf' Type [lt P]=p ia > Ty ^[P]

If $x \in FV(\exists y P) - \{y\}$ then

$x^{Type}[\exists y P] = x^{Type}[P]$

If $z \in FV(\mu(u,v,P,>)(x,y,v)) - \{y\}$ then

$z^{Type}[\mu(u,v,P,>)(x,y,v)] = z^{Type}[P]$

For any substitution $\theta$,

$(x\theta)^{Type}[P\theta] = x^{Type}[P]$

and

$(P\theta)^{proc} = P^{proc}\theta$

and

$(P\theta)^{data} = P^{data}\theta$
3.6.1 Consequences of the above

p m (yWyP] = P in n{y} (35) 
ffitj^r'ni (36) 
3.7 Axioms about program forming operations 

Preconditions for the sequential composition operator The operation $P; Q$ is allowed if no
procedure variable $x$ is in $P^{out} \cap Q^{out}$ and if $P^{data} = Q^{data}$. The latter condition can be satisfied by
adding extra variables to $P$ and $Q$ if necessary using the new variable axiom that appears below.

Definition of $\phi$ operator The definition of the sequential composition operator makes use of the
$\phi$ operator, defined as follows: If $f$ and $g$ are semantic functions and $P$ and $Q$ are program fragments
then the semantic function $\phi_{P,Q}(f, g)$ satisfies the following:

1. if $z \in P^{proc}$ then $\phi_{P,Q}(f, g)(z) = f(z)$.

2. if $z \in Q^{proc} - P^{proc}$ then $\phi_{P,Q}(f, g)(z) = g(z)$.

3. if $z \in P^{data} \cap Q^{data}$ and $f(z) = (\alpha, \beta)$ and $g(z) = (\beta, \gamma)$ then $\phi_{P,Q}(f, g)(z) = (\alpha, \gamma)$.

If $f$ and $g$ give a semantics for $P$ and $Q$, then $\phi(f, g)$ gives a semantics for $P; Q$.

Preconditions for $\phi(f, g)$ operator $prec(\phi, P, Q, f, g)$ specifies

1. if $z \in P^{proc} \cap Q^{proc}$ then $f(z) = g(z)$.

2. if $z \in P^{data} \cap Q^{data}$ then $\exists \alpha \beta \gamma (f(z) = (\alpha, \beta) \land g(z) = (\beta, \gamma))$.

Sequential composition axiom

$\{h(x \circ y) : [x, y]P(x); Q(y)\} \equiv \exists f g(\{f(x) : [x]P(x)\} \land \{g(y) : [y]Q(y)\} \land h = \phi_{P,Q}(f, g) \land prec(\phi, P, Q, f, g))$    (37)
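
The $\phi$ operator and its precondition can be illustrated with a small Python sketch. It assumes semantic functions are dictionaries, that the semantics of a data variable is an (initial, final) pair, and that a procedure's semantics is any comparable value; the function names `phi` and `prec` mirror the notation above but the encoding itself is an assumption made for this example.

```python
from typing import Any, Dict, Set

Sem = Dict[str, Any]  # assumed encoding: variable name -> semantics


def prec(f: Sem, g: Sem, p_proc: Set[str], q_proc: Set[str],
         p_data: Set[str], q_data: Set[str]) -> bool:
    """Shared procedures agree; for shared data, P's final value is Q's initial value."""
    procs_ok = all(f[z] == g[z] for z in p_proc & q_proc)
    data_ok = all(f[z][1] == g[z][0] for z in p_data & q_data)
    return procs_ok and data_ok


def phi(f: Sem, g: Sem, p_proc: Set[str], q_proc: Set[str],
        p_data: Set[str], q_data: Set[str]) -> Sem:
    """Combine a semantics f for P and g for Q into a semantics for P; Q."""
    h: Sem = {}
    for z in p_proc:
        h[z] = f[z]                      # clause 1
    for z in q_proc - p_proc:
        h[z] = g[z]                      # clause 2
    for z in p_data & q_data:
        h[z] = (f[z][0], g[z][1])        # clause 3: (alpha, gamma)
    return h


# P is "y := x + 1" and Q is "v := y + 1", both over the data variables x, y, v.
data = {"x", "y", "v"}
f = {"x": (1, 1), "y": (0, 2), "v": (0, 0)}   # a semantics for P
g = {"x": (1, 1), "y": (2, 2), "v": (0, 3)}   # a semantics for Q
if prec(f, g, set(), set(), data, data):
    print(phi(f, g, set(), set(), data, data))
```

Here the combined semantics gives `y` the pair (0, 2) and `v` the pair (0, 3): the intermediate value of each data variable is hidden, and only the initial value before $P$ and the final value after $Q$ remain.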

Conditional axiom

$\{w, u, v : [z, x, y]z?P(x)?Q(y)\} \equiv (w = true \land \{u : [x]P(x)\}) \lor (w = false \land \{v : [y]Q(y)\})$

if $x = FV(P) \land y = FV(Q)$    (38)

Here $P$ or $Q$ may be empty.

Deleting output axiom Intuitively, this axiom declares $z$ to be a local variable.

$\forall x' y'(\exists z'\{x', z', y' : [x, z, y]P\} \equiv \{x', y' : [x, y]\exists z P\})$    (39)
Preconditions for procedure operator axiom The operation $\uparrow_p^{y} P(x, y)$ is allowed if no vari-
able $x_i$ is in $P^{out}$, and if no execution of $P$ can have side effects on the variables $x$. That is,
$R^L(P(x, y))$ where $R(u, v) \equiv \forall i(u_i^{init} = u_i^{fin})$.

Procedure operator axioms Intuitively, these axioms define a new procedure $p$ having the
formal parameters $y$.

$\forall u \exists q\{u, q : [x, p]\uparrow_p^{y} P(x, y)\}$    (40)

$\forall p q u(\{u, q : [x, p]\uparrow_p^{y} P(x, y)\} \supset \forall v(\{u, v : [x, y]P(x, y)\} \equiv q(v)))$    (41)

where $p$ is a new variable or an input variable of $P$.

Preconditions for application axiom The operation $\downarrow_p^{y} P$ is allowed if $P; \downarrow_p^{y}$ is allowed. The
fragment $P$ may be empty, in which case there are no preconditions and $\downarrow_p^{y} P$ is equivalent to $\downarrow_p^{y}$.

Application axioms Intuitively, this operator calls a procedure $p$ with actual parameters $y$.

$\forall u v q(\{u, v, q : [x, y, p]\downarrow_p^{y} P(x, p)\} \equiv \{u, v, q : [x, y, p](P(x, p); \downarrow_p^{y})\})$    (42)

$\forall v q(\{v, q : [y, p]\downarrow_p^{y}\} \equiv q(v))$    (43)

Least fixpoint axiom If 

Vx'u'T,y'v'{x' , u', y, v' : P(x,u,y,v)} 

and y, v are outputs and x, u are inputs of P and u, v are procedure variables then 

$\{x', y', w' : \mu(u, v, P, >)(x, y, v)\} \equiv$

$(\{x', w', y', w' : P(x, v, y, v)\} \land \forall z' \forall y''(\{x', z', y'', z' : P(x, u, y, v)\} \supset z' > w'))$    (44)

3.8 Axioms about variables 

Correspondence axiom This axiom implies that semantics for procedure variables not appearing
free in $P$ can be arbitrary. However, data variables, even not free in $P$, can be influenced by side
effects of the execution of $P$. Note that $([x]P)^{data}$ may include data variables in $x$ that are not free
in $P$.

If $\{(u_i, x_i) : x_i \in FV(P) \cup ([x]P)^{data}\} = \{(v_j, y_j) : y_j \in FV(P) \cup ([y]P)^{data}\}$ then

$\{u : [x]P\} \supset \{v : [y]P\}$.    (45)

Alternate version If $\forall u \in FV(P) \cup ([x]P)^{data} \cup ([y]P)^{data}.\ f(u) = g(u)$ then

$\{f(x) : [x]P\} \supset \{g(y) : [y]P\}$.    (46)
New variable axiom If $y$ is a data variable that does not appear free in $P$ or in $x$ then

$\{u, v, w : [x, y, \psi]P\} \equiv \{u, v : [x, \psi]P\} \land \{u, w : [x, \psi]P\}$    (47)

Variable renaming axiom If $\theta$ is a variable renaming then

$\{y : [x]P(x)\} \supset \{y : [x\theta]P(x\theta)\}$    (48)

Note that variable renamings are variable substitutions and are output injective.

Special case If $\sigma$ is an integer permutation then

$\{y : [x]P(x)\} \supset \{y : [\sigma(x)]P(\sigma(x))\}$    (49)

Equality axiom

$(\{y : [x]P\} \land x_i \equiv x_j) \supset y_i = y_j$    (50)



Definitional independence axiom The idea of this axiom is that any semantics for an instance
of a program must also satisfy the constraints of the general program, that is, the definitions are
independent of the instance of the program.

If $x \to y$ and $P(y)$ does not identify distinct outputs or data variables of $P(x)$ then

$\forall z(\{z : [y]P(y)\} \supset \{z : [x]P(x)\})$    (51)

Alternative version If $\theta$ is a variable substitution that is output-injective on $P$ and does not
identify two data variables then

$\forall z(\{z : [x\theta]P(x\theta)\} \supset \{z : [x]P(x)\})$    (52)

Examples of definitional independence Let $P(x, y, u, v)$ be the program $y := x + 1; v := u + 1$.
Consider the program $P(x, y, y, v)$ which is $y := x + 1; v := y + 1$. Then $\{((1, 1), (0, 2), (0, 2), (0, 3)) :
P(x, y, y, v)\}$. Definitional independence would assert that $\{((1, 1), (0, 2), (0, 2), (0, 3)) : P(x, y, u, v)\}$,
which is not correct because $u$ is not modified in $P(x, y, u, v)$. The problem is that two data variables
have been identified. However, there is a semantics for $P(x, y, u, v)$ in which the final values of the
variables $(x, y, u, v)$ are $(1, 2, 2, 3)$, respectively.

Definitional independence applies to recursive procedure definitions. For this example, assume
that $L$ has function procedures. Let $P(f, g, h, k)$ define $f(x)$ as "if x = then else g(x) + k(h(x))."
Then $P(f, g, h, f)$ defines $f(x)$ as "if x = then else g(x) + f(h(x))." Any semantics for $P(f, g, h, f)$
is also a semantics for $P(f, g, h, k)$.

3.9 Consequences of the above axioms

Permutation axiom If $\sigma$ is an integer permutation then

$\{y : [x]P\} \supset \{\sigma(y) : [\sigma(x)]P\}$    (53)
Variable independence axiom If $x_i$ is a procedure variable that does not appear free in $P$ then
$\{y : [x]P\}$ does not depend on $y_i$, so

$\{y : [x]P\} \supset \forall y_i\{y : [x]P\}$.    (54)

4 Definitional Independence and Fixed Points 

In order to justify the reasonableness of the preceding axioms, it is possible to show that programming
languages with certain properties satisfy definitional independence and the fixpoint axiom. For the
former, suppose that $y$ and $z$ are procedure variables of $P$ and $w$ are data variables. Let $w^{init}$ be
$(w_1^{init}, \ldots, w_n^{init})$ and let $w^{fin}$ be $(w_1^{fin}, \ldots, w_n^{fin})$. Writing the assertion $\{(u, v, x) : P(y, z, w)\}$ as
$\{[u, x^{init} \to v, x^{fin}] : P[y, w^{init} \to z, w^{fin}]\}$ indicates that $y$ are the input variables of $P$, $z$ are the
output variables, $u$ are possible semantics of $y$, $v$ are possible semantics of $z$, and $x$ are possible
semantics of $w$. For simplicity, $w$ is ignored from now on, because it does not affect the argument.

Definition 4.1 A programming language $L$ is denotational if for all programs $P$ in $L$ there is
a monotonic and continuous functional $\tau_P$ such that $\{[u \to v] : P[y \to z]\}$ iff $v = \tau_P(u)$ and
$(y, z) \to (u, v)$. (Here it is assumed that $y$ and $z$ are disjoint.) Also, if $\theta$ is an output injective
variable substitution, then $\{[u' \to v'] : P[y\theta \to z\theta]\}$ iff $(u', v')$ is the minimal element of the set

$\{(u'', v'') : v'' = \tau_P(u'')$ and $(y\theta, z\theta) \to (u'', v'')$ and $u''_i = u'_i$ if $y_i\theta \notin z\theta\}$.

The idea is that if $\theta$ identifies input and output variables of $P$, then these variables are considered
as output variables and they need to be minimized subject to the equation $v'' = \tau_P(u'')$. The
condition $(y, z) \to (u, v)$ means that if two elements $y_i$ and $y_j$ are the same, their semantics must
be the same, and similarly for elements of $z$ and for common elements of $y$ and $z$.

This condition is reasonable; it states that any inputs to a partial program have definitions
outside the partial program, so nothing can be assumed about their semantics. But any procedure
that is an output of the partial program has a definition in the partial program, and therefore has the
denotationally smallest semantics that satisfies the definition. For example, in the partial program
$P(x, y \to z, w)$, the definitions of $x$ and $y$ occur outside of $P$ but the definitions of $z$ and $w$ occur in
$P$, so the semantics of $z$ and $w$ are constrained by the semantics of $x$ and $y$ and the definitions of $z$
and $w$ in terms of $x$ and $y$. Now consider $P(x, y \to y, w)$. This is an instance of $P(x, y \to z, w)$. The
procedure $y$ now has a recursive definition, and receives the least possible semantics satisfying its
definition. The definition of the procedure $x$ occurs elsewhere, so that the semantics of $x$ is arbitrary.
But the semantics of $x$ determines the semantics of $y$ and $w$.
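
The phrase "least possible semantics satisfying its definition" can be made concrete with a Kleene-style iteration from the everywhere-undefined function. The Python sketch below is only an illustration: it assumes a hypothetical recursive definition f(x) = 0 if x = 0, else g(x) + f(h(x)), over a small finite domain, and represents a partial function as a dictionary whose missing keys are undefined. The names `tau` and `least_fixpoint` are introduced here and are not part of PL.

```python
from typing import Callable, Dict, Iterable

Partial = Dict[int, int]  # assumed encoding: missing key means "undefined"


def tau(f: Partial, g: Callable[[int], int], h: Callable[[int], int],
        domain: Iterable[int]) -> Partial:
    """One unfolding of the assumed definition f(x) = 0 if x == 0 else g(x) + f(h(x))."""
    out: Partial = {}
    for x in domain:
        if x == 0:
            out[x] = 0
        elif h(x) in f:                 # the recursive call is already defined
            out[x] = g(x) + f[h(x)]
    return out


def least_fixpoint(step: Callable[[Partial], Partial], max_iter: int = 1000) -> Partial:
    """Iterate a monotonic step function from bottom until it stabilizes."""
    f: Partial = {}                     # bottom: defined nowhere
    for _ in range(max_iter):
        nxt = step(f)
        if nxt == f:
            return f
        f = nxt
    raise RuntimeError("no fixpoint reached within max_iter")


# The inputs g and h are defined "elsewhere"; f is the recursively defined output.
terminating = least_fixpoint(lambda f: tau(f, lambda x: x, lambda x: x - 1, range(5)))
looping = least_fixpoint(lambda f: tau(f, lambda x: x, lambda x: x, range(5)))
print(terminating)
print(looping)
```

With h(x) = x - 1 the iteration defines f on the whole domain, while with h(x) = x the least solution stays undefined everywhere except at 0, even though many total functions would fail to satisfy, and others would trivially satisfy, the same equation. This is the sense in which an output procedure is minimized while the inputs are left arbitrary.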

Theorem 1 If $L$ is denotational then every program $P$ in $L$ satisfies definitional independence.

Proof: Suppose $\{[u' \to v'] : P[y\theta \to z\theta]\}$. First assume that $y\theta$ and $z\theta$ are disjoint. This means
that no element of $y\theta$ is in $z\theta$. Since $L$ is denotational, it must be that $(y\theta, z\theta) \to (u', v')$ and
$v' = \tau_P(u')$ by Definition 4.1. Since $(y, z) \to (y\theta, z\theta)$ and $(y\theta, z\theta) \to (u', v')$, $(y, z) \to (u', v')$
as well. Because $v' = \tau_P(u')$ and $(y, z) \to (u', v')$, $\{(u', v') : P(y, z)\}$ also.

Now consider the case when $y\theta$ and $z\theta$ are not disjoint. Write $(y\theta, z\theta)$ as $(y'\theta, x, x, z'\theta)$,
indicating by $x$ the parts of $y$ and $z$ that are identified by $\theta$ and by $y'$ and $z'$ the remaining
parts of $y$ and $z$. Similarly, write $(u', v')$ as $(u'', w, w, v'')$. The idea of the definition is that $w$
and $v''$ are chosen to be as small as possible subject to the condition that $(w, v'') = \tau_P(u'', w)$,
but $u''$ is chosen to be equal to the corresponding components of $u'$, which are not constrained.

Since $L$ is denotational and $\{[u' \to v'] : P[y\theta \to z\theta]\}$, $(u', v')$ is the minimal element of the
set $\{(\alpha, \beta) : \beta = \tau_P(\alpha)$ and $(y\theta, z\theta) \to (\alpha, \beta)$ and $\alpha_i = u'_i$ if $y_i\theta \notin z\theta\}$.

From this it follows that $v' = \tau_P(u')$ and $(y\theta, z\theta) \to (u', v')$.

We need to show that $\{[u' \to v'] : P[y \to z]\}$, that is, $(u', v')$ is the minimal element of the set
$\{(\alpha, \beta) : \beta = \tau_P(\alpha)$ and $(y, z) \to (\alpha, \beta)$ and $\alpha_i = u'_i$ for all $i\}$.

First, $v' = \tau_P(u')$ as noted above.

Second, $(y, z) \to (u', v')$ because $(y, z) \to (y\theta, z\theta)$ for any $\theta$ and (as noted above) $(y\theta, z\theta) \to
(u', v')$.

Finally, we need to show that if $\beta = \tau_P(\alpha)$ and $(y, z) \to (\alpha, \beta)$ and $\alpha_i = u'_i$ for all $i$ then
$\alpha \geq u'$ and $\beta \geq v'$. But if $\alpha_i = u'_i$ for all $i$ then $\alpha = u'$. Thus $\beta = \tau_P(\alpha) = \tau_P(u') = v'$. Thus
$\alpha = u'$ and $\beta = v'$.

As an example where the theorem fails if $\theta$ identifies two data variables, consider the program
$[x, y]x := x + 1$ and its instance $[x, x]x := x + 1$. The latter has the semantics $((0, 1), (0, 1))$ but not
the former, because the value of $y$ may not change.
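
This counterexample can be checked mechanically. The Python sketch below assumes, as in the text, that the semantics of a data variable is an (initial, final) pair, and it models the variable list as a tuple of names in which a repeated name means two positions have been identified. The helpers `increment_first` and `is_semantics` are hypothetical names for this illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

Pair = Tuple[int, int]  # (initial value, final value) of a data variable


def increment_first(store: Dict[str, int], names: Sequence[str]) -> None:
    """The program body: the first listed variable is incremented by 1."""
    store[names[0]] += 1


def is_semantics(names: Sequence[str], pairs: Sequence[Pair],
                 body: Callable[[Dict[str, int], Sequence[str]], None] = increment_first
                 ) -> bool:
    """Check whether `pairs` is a possible semantics for the variable list `names`."""
    store: Dict[str, int] = {}
    for name, (init, _fin) in zip(names, pairs):
        # identified variables must start from the same initial value
        if store.setdefault(name, init) != init:
            return False
    body(store, names)
    return all(store[name] == fin for name, (_init, fin) in zip(names, pairs))


candidate = ((0, 1), (0, 1))
print(is_semantics(("x", "x"), candidate))   # instance [x, x] x := x + 1
print(is_semantics(("x", "y"), candidate))   # general program [x, y] x := x + 1
```

The candidate semantics is accepted for the instance and rejected for the general program, because in the general program the second variable must keep its initial value.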

Because functional, logic, and procedural languages are denotational, with reasonable defini- 
tions of their semantics, it is reasonable to assume that all these languages also satisfy definitional 
independence, and that all the inference rules in PL apply to all such languages. 

In practice, one may use PL without a formal proof that the languages L satisfy definitional 
independence, to obtain programs that may have added reliability even if there is no formal proof 
of correctness. 

The denotational property also suffices to justify the least fixpoint axiom. 

Theorem 2 Suppose $L$ is denotational. Then the least fixpoint axiom is satisfied if one lets
$\mu(u, v, P, >)$ be $P\theta$ where $\theta$ maps $u$ to $v$ but leaves all other variables unchanged.

Proof: We show the least fixpoint axiom, axiom which is the following: If 

Vx u Tiy v {x , U , y, V : P(x, u,y, v)} 

and y, v are outputs and x, u are inputs of P then 

{^ , ,y , ,w' : fi(u,v,P,>)(x,y,v)} = 

({x,w',y',w' : P(x,v,y,v)} A V z'Vy" \{x' , z' \y" , z : P(x,u,y,v)} D z > it/)) (55) 

The hypothesis Vx'u'^y v'{x ,u',y, v' : P(x,u,y,v)} is satisfied because L is denotational. 
Suppose {x',y',w' : (J,(u, v, P, >)(x, y, v)}. Defining fi as in the theorem, and using the 
correspondence axiom, this is equivalent to {x' ,w' ,y' ,w' : P(x,v,y,v)}. For the remain- 
ing part, write {x 1 , w', y' , w' : P(x,v,y,v)} as {\x',w' — > y',w'] : P\x,v — > y, v}} and write 
{x',z',y",z' : P(x,u,y,v)} as {[x',z' — > y" 7 z'] : P\x, u — > y, v]}. Because L is denota- 
tional, {\x',z' — > y" , z'\ : P[x,u — > y, v]} iff (x' . z', y" , z') is the minimal element of the set 
{(x",z",y"',z") : {x",z") = T P {y'",z") and ((x,u), (y,v)) - ((x", z"), (y"',z")) and x" = 
x' and z" = z' (because x and y are assumed disjoint). The condition ((x,u), (y,v)) — > 
((x", z"), (y , z")) is true because 3;, u,y, and w are pairwise disjoint. Thus the only constraint 
on (x 1 , z',y", z') is that (x', z') — Tp(y" , z'). However, because 9 identifies u and v, the corre- 
sponding constraint on (x', w',y) is that w' should be minimal satisfying (x , w') = rp(y , w'). 
Therefore z' > w' as specified above. The other direction follows by similar reasoning, because 
L is denotational. 



5 Inference Rules 

Section 3.7 contains axioms for program language semantics expressed in terms of the operator
:. These axioms lead to relational inference rules for deriving assertions $R^L(P)$ where $P$ is an $L$
program and $R$ is a relation on semantics of program variables, using the definition

$R^L([x]P) \equiv \forall y(\{y : [x]P\} \supset R(y))$

If $R$ is such a relation and $\sigma$ is an integer function then $R\sigma$ denotes the relation such that
$R\sigma(y) \equiv R(\sigma(y))$, that is, $R\sigma(y_1, \ldots, y_n)$ iff $R(y_{\sigma(1)}, \ldots, y_{\sigma(n)})$.

The system relational $PL(L)$ consists of the following inference rules, which are consequences of
the axioms given in Section 3.7.



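Underlying logic rule

$\dfrac{R_1^L(P), \quad R_1 \supset R_2}{R_2^L(P)}$

Universal quantification rule

$\dfrac{R_k^L(P), \quad k \text{ arbitrary}}{\forall k R_k^L(P)}$

Rules about variables

Variable renaming rule

$\dfrac{R^L([x]P(x)), \quad \theta \text{ is a variable renaming}}{R^L([x\theta]P(x\theta))}$

Permutation rule 1

$\dfrac{R^L([x]P(x)), \quad \sigma \text{ is an integer permutation}}{R\sigma^L([\sigma^{-1}(x)]P(x))}$

Permutation rule 2

$\dfrac{R^L([x]P(x)), \quad \sigma \text{ is an integer permutation}}{R\sigma^L([x]P(\sigma(x)))}$

Equality rule

$\dfrac{R^L(P(x)), \quad x_j \equiv x_i, \quad R_1(y) \equiv (R(y) \land y_i = y_j)}{R_1^L(P(x))}$

Substitution rule

$\dfrac{R^L([x]P(x)), \quad \sigma \text{ is an integer function such that } \sigma \text{ does not identify data variables of } x}{R\sigma^L([x]P(\sigma(x)))}$

Correspondence rule

$\dfrac{R^L([x]P), \quad \forall f g(R(f(x)) \land \forall u \in FV(P) \cup ([x]P)^{data}(f(u) = g(u)) \supset R_1(g(y)))}{R_1^L([y]P)}$

Definitional independence rule

$\dfrac{R^L([x]P(x)), \quad \theta \text{ output injective on } P \text{ and does not identify data variables of } P}{R^L([x\theta]P(x\theta))}$

New variable rule

$\dfrac{R^L([x, \psi]P), \quad y \text{ a new data variable}, \quad R(u, v) \land R(u, w) \supset R_1(u, v, w)}{R_1^L([x, y, \psi]P)}$

Rules about program operations

Composition rule

$\dfrac{R_1^L(P_1(y)), \quad R_2^L(P_2(z)), \quad prec(\phi, P_1, P_2, f, g), \quad \forall f g(R_1(f(y)) \land R_2(g(z)) \supset R(\phi_{P_1,P_2}(f, g)(y \circ z)))}{R^L([y, z]P_1(y); P_2(z))}$

Conditional rule

$\dfrac{R_1^L(P_1(y)), \quad R_2^L(P_2(z)), \quad y \subseteq FV(P_1), \quad z \subseteq FV(P_2), \quad \forall x' y' z'((x' \land R_1(y')) \lor (\lnot x' \land R_2(z')) \supset R(x', y', z'))}{R^L([x, y, z]x?P_1(y)?P_2(z))}$

Output deletion rule

$\dfrac{R^L([u, x, v]P), \quad R_1(u, v) \equiv \exists x R(u, x, v)}{R_1^L([u, v]\exists x P)}$

Procedure rule

$\dfrac{R^L(P(x, y)), \quad R_1(u, q) \equiv \forall v(q(v) \supset R(u, v))}{\forall u \exists q R_1(u, q) \land R_1^L([x, p]\uparrow_p^{y} P(x, y))}$

Application rule 1

$\dfrac{R^L(P(x, p); \downarrow_p^{y}), \quad P \text{ non-empty}}{R^L(\downarrow_p^{y} P(x, p))}$

Application rule 2

$\dfrac{q(v) \supset R(v, q)}{R^L([y, p]\downarrow_p^{y})}$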

Wu'T$v'R@,u\t,v'),R L (P%u,y,v)),x,u G ^ m ,f,« G f out 
Wu'y'v'(R(x',u',y',v') D 3yV'{z'X,y"y : P(x,u,y,v)}) 

v^r^(^i(^,rX) = («(^X,rX) a v^W'^.^^'^O^/^l «;'))) j 

t = = Least nxpomt rule 

P x (fj,(u,v,P(x,u,y, v), > L )) 

The second line of the hypothesis states that $R$ does not hold on "bad inputs" to $P$, that is, inputs
for which there is no output. The ordering $>_L$ depends on $L$ and expresses the effect of recursion
in $L$. Usually $>$ abbreviates $>_L$.

Theorem 3 These inference rules are logical consequences of the axioms for program language
semantics which appear in Section 3.7.



Proof: The proofs for each rule follow: 

Underlying logic rule Suppose $R_1^L(P)$ and $\forall y(R_1(y) \supset R_2(y))$. By the definition of $R_1^L(P)$,
axiom (2), $\forall y(\{y : [x]P\} \supset R_1(y))$. Since $\forall y(R_1(y) \supset R_2(y))$, $\forall y(\{y : [x]P\} \supset R_2(y))$. Again by
axiom (2), $R_2^L(P)$.

Universal quantification rule Suppose $R_k^L(P)$ for arbitrary $k$. By the definition of $R_k^L$,
axiom (2), $\forall y(\{y : [x]P\} \supset R_k(y))$. Because $k$ is arbitrary, $\forall k(\forall y(\{y : [x]P\} \supset R_k(y)))$. Because
$k$ does not appear in the antecedent, $\forall y(\{y : [x]P\} \supset \forall k R_k(y))$. By axiom (2), $\forall k R_k^L(P)$.

Variable renaming rule Suppose $R^L([x]P(x))$ and $\theta$ is a variable renaming. Suppose also
that $\{y : [x\theta]P(x\theta)\}$. By the variable renaming axiom, $\{y : [u]P(u)\} \supset \{y : [u\theta^{-1}]P(u\theta^{-1})\}$
because $\theta^{-1}$ is also a variable renaming. Letting $u$ be $x\theta$, from the assumption $\{y : [x\theta]P(x\theta)\}$
it follows that $\{y : [x]P(x)\}$. Since $R^L([x]P(x))$, $R(y)$. Therefore $\{y : [x\theta]P(x\theta)\}$ implies
$R(y)$. By the definition of $R^L$, $R^L([x\theta]P(x\theta))$.

Permutation rule 1 Suppose $R^L([x]P(x))$ and $\sigma$ is an integer permutation. Suppose also
that $\{z : [\sigma^{-1}(x)]P(x)\}$. By the permutation axiom, $\{\sigma(z) : [x]P(x)\}$. Since $R^L([x]P(x))$,
$R(\sigma(z))$ holds, or, $R\sigma(z)$. Thus $\{z : [\sigma^{-1}(x)]P(x)\}$ implies $R\sigma(z)$. Therefore $R\sigma^L([\sigma^{-1}(x)]P(x))$.

Permutation rule 2 Suppose $R^L([x]P(x))$ and $\sigma$ is an integer permutation. By per-
mutation rule 1, $R\sigma^L([\sigma^{-1}(x)]P(x))$. By the special case of the variable renaming axiom,
$R\sigma^L([x]P(\sigma(x)))$.

Equality rule Suppose $R^L(P(x))$, $x_i \equiv x_j$, and $R_1(y) \equiv (R(y) \land y_i = y_j)$. Recall that
$P(x)$ abbreviates $[x]P(x)$. Suppose also that $\{y : [x]P\}$. Since $R^L(P(x))$, $R(y)$ holds. By
the equality axiom, $(\{y : [x]P\} \land x_i \equiv x_j) \supset y_i = y_j$. Therefore $y_i = y_j$. Since $R_1(y) \equiv
(R(y) \land y_i = y_j)$, $R_1(y)$ holds. Therefore $\{y : [x]P\}$ implies $R_1(y)$. Therefore $R_1^L(P(x))$.

Substitution rule, special case Suppose $R^L([u, v, x]P(u, v, x))$ and $\sigma$ is an integer function
such that $\sigma$ does not identify data variables of $x$ and such that $\sigma(i) = i$ for $i \neq 2$ and
$\sigma(2) = 1$. Then $R\sigma(u', v', x') \equiv R(u', u', x')$. It is necessary to show $R\sigma^L([u, v, x]P\sigma(u, v, x))$,
that is, $R\sigma^L([u, v, x]P(u, u, x))$. Suppose $\{u', v', x' : [u, v, x]P(u, u, x)\}$. It is necessary to
show $R\sigma(u', v', x')$. By the correspondence axiom, $\{u', u', x' : [u, u, x]P(u, u, x)\}$ if $\sigma$ is output
injective. By definitional independence, $\{u', u', x' : [u, v, x]P(u, v, x)\}$ if $u$ and $v$ are not distinct
data variables. Because $R^L([u, v, x]P(u, v, x))$, $R(u', u', x')$. Therefore $R\sigma(u', v', x')$. Hence
$R\sigma^L([u, v, x]P(u, u, x))$. Therefore $R\sigma^L([u, v, x]P\sigma(u, v, x))$.

Substitution rule, general case Suppose $R^L([x]P(x))$ and $\sigma$ is an integer permutation
such that $\sigma$ does not identify data variables of $x$. Then $R\sigma^L([x]P\sigma(x))$ by combining permu-
tation rule 1 and the variable renaming rule. Now, let $\sigma$ be an arbitrary integer function from
$\{1, \ldots, n\}$ to $\{1, \ldots, n\}$ where $x$ has $n$ components. Then $\sigma$ can be expressed as the composition
$\sigma_1\sigma_2 \ldots \sigma_k$ where the $\sigma_i$ are permutations and functions as in the preceding special case. For
each such $\sigma_i$, $R\sigma_1\sigma_2 \ldots \sigma_{i-1}^L([x]P\sigma_1\sigma_2 \ldots \sigma_{i-1}(x))$ implies $R\sigma_1\sigma_2 \ldots \sigma_i^L([x]P\sigma_1\sigma_2 \ldots \sigma_i(x))$. By
combining all these implications, $R^L([x]P(x))$ implies $R\sigma^L([x]P\sigma(x))$.

Correspondence rule Suppose $R^L([x]P)$. It is necessary to show $R_1^L([y]P)$. For this,
suppose $\{y' : [y]P\}$. Let $f$ and $g$ be such that $g(y) = y'$ and $\forall u \in FV(P) \cup ([x]P)^{data}(f(u) =
g(u))$. Then $\{g(y) : [y]P\}$. By the correspondence axiom, with $f$ and $g$ interchanged, $\{f(x) :
[x]P\}$. Because $R^L([x]P)$, $R(f(x))$. By the definition of $R_1$, $R_1(g(y))$, that is, $R_1(y')$.

Definitional independence rule Suppose $R^L([x]P(x))$ and $\theta$ is output injective on $P$.
Suppose also that $\{z : [x\theta]P(x\theta)\}$. By the definitional independence axiom, if $\theta$ is a variable
substitution that is output-injective on $P$ and does not identify two data variables, then $\forall z(\{z :
[x\theta]P(x\theta)\} \supset \{z : [x]P(x)\})$. Therefore $\{z : [x]P(x)\}$. Since $R^L([x]P(x))$, $R(z)$ holds. Thus
$\{z : [x\theta]P(x\theta)\}$ implies $R(z)$. Therefore $R^L([x\theta]P(x\theta))$.

New variable rule Suppose $R^L([x, \psi]P)$, $y$ is a new data variable for $P$, and $R(u, v) \land
R(u, w) \supset R_1(u, v, w)$. Suppose also that $\{u, v, w : [x, y, \psi]P\}$. By the new variable axiom,
$\{u, v : [x, \psi]P\}$ and $\{u, w : [x, \psi]P\}$. Because $R^L([x, \psi]P)$, $R(u, v)$ and $R(u, w)$. By the above
implication, $R_1(u, v, w)$. Therefore $R_1^L([x, y, \psi]P)$.

Composition rule Suppose $R_1^L(P_1(y))$, $R_2^L(P_2(z))$, $prec(\phi, P_1, P_2, f, g)$, and $\forall f g(R_1(f(y)) \land
R_2(g(z)) \supset R(\phi_{P_1,P_2}(f, g)(y \circ z)))$. Suppose also that $\{y' \circ z' : P_1(y); P_2(z)\}$. Then there must
be a semantic function $h$ such that $\{h(y \circ z) : P_1(y); P_2(z)\}$. By the sequential composition
axiom (37) there are semantic functions $f$ and $g$ such that $\{f(y) : P_1(y)\}$ and $\{g(z) : P_2(z)\}$
and $h = \phi_{P_1,P_2}(f, g)$ and $prec(\phi, P_1, P_2, f, g)$. From $R_1^L(P_1(y))$ and $R_2^L(P_2(z))$ it follows that
$R_1(f(y))$ and $R_2(g(z))$. From $\forall f g(R_1(f(y)) \land R_2(g(z)) \supset R(\phi_{P_1,P_2}(f, g)(y \circ z)))$ it follows that
$R(\phi_{P_1,P_2}(f, g)(y \circ z))$. Therefore $R(h(y \circ z))$, so $R(y' \circ z')$. Therefore $\{y' \circ z' : P_1(y); P_2(z)\} \supset
R(y', z')$. By axiom (2), $R^L(P_1(y); P_2(z))$.

Conditional rule Suppose $R_1^L(P_1(y))$, $R_2^L(P_2(z))$, $y \subseteq FV(P_1)$, $z \subseteq FV(P_2)$, and $\forall x' y' z'((x' \land
R_1(y')) \lor (\lnot x' \land R_2(z')) \supset R(x', y', z'))$. Suppose also that $\{x', y', z' : x?P_1(y)?P_2(z)\}$. By the
conditional axiom (38), $(x' = true \land \{y' : [y]P_1(y)\}) \lor (x' = false \land \{z' : [z]P_2(z)\})$. From
$R_1^L(P_1(y))$ and $R_2^L(P_2(z))$ it follows that $(x' = true \land R_1(y')) \lor (x' = false \land R_2(z'))$. From
$\forall x' y' z'((x' \land R_1(y')) \lor (\lnot x' \land R_2(z')) \supset R(x', y', z'))$ it follows that $R(x', y', z')$. Therefore
$\{x', y', z' : x?P_1(y)?P_2(z)\} \supset R(x', y', z')$. By axiom (2), $R^L([x, y, z]x?P_1(y)?P_2(z))$.

Output deletion rule Suppose $R^L([u, x, v]P)$ and $R_1(u, v) \equiv \exists x R(u, x, v)$. Suppose $\{u', v' :
[u, v]\exists x P\}$. By the deleting output axiom,

$\forall u' v'(\exists x'\{u', x', v' : [u, x, v]P\} \equiv \{u', v' : [u, v]\exists x P\})$    (56)

Therefore $\exists x'\{u', x', v' : [u, x, v]P\}$. Because $R^L([u, x, v]P)$, it follows that $\exists x' R(u', x', v')$.
Therefore by definition of $R_1$, $R_1(u', v')$. Because $\{u', v' : [u, v]\exists x P\}$ implies $R_1(u', v')$, there-
fore $R_1^L([u, v]\exists x P)$.

Procedure rule Suppose $R^L(P(x, y))$ and $R_1(u, q) \equiv \forall v(q(v) \supset R(u, v))$. It is necessary to
show $\forall u \exists q R_1(u, q)$ and $R_1^L([x, p]\uparrow_p^{y} P(x, y))$. By the procedure operator axioms,

$\forall u \exists q\{u, q : [x, p]\uparrow_p^{y} P(x, y)\}$    (57)

and

$\forall p q u(\{u, q : [x, p]\uparrow_p^{y} P(x, y)\} \supset \forall v(\{u, v : [x, y]P(x, y)\} \equiv q(v)))$    (58)

where $p$ is a new variable. From equations (57) and (58) it follows that

$\forall u \exists q \forall v(\{u, v : [x, y]P(x, y)\} \equiv q(v))$.    (59)

By the definition of $R^L(P(x, y))$, it follows that $\{u, v : [x, y]P(x, y)\} \supset R(u, v)$. From this and
equation (59) it follows that $\forall u \exists q \forall v(q(v) \supset R(u, v))$, and by the definition of $R_1$ this implies
$\forall u \exists q R_1(u, q)$. From equation (58) and the fact that $R^L(P(x, y))$ and by the definition of $R_1$ it
follows that

$\forall p q u(\{u, q : [x, p]\uparrow_p^{y} P(x, y)\} \supset R_1(u, q))$.    (60)

Therefore $R_1^L([x, p]\uparrow_p^{y} P(x, y))$.

Application rule 2 Suppose $q(v) \supset R(v, q)$. It is necessary to show $R^L([y, p]\downarrow_p^{y})$. By the
second application axiom, $\forall v q(\{v, q : [y, p]\downarrow_p^{y}\} \equiv q(v))$. Because $q(v) \supset R(v, q)$, $\forall v q(\{v, q :
[y, p]\downarrow_p^{y}\} \supset R(v, q))$. Therefore $R^L([y, p]\downarrow_p^{y})$.

Least fixpoint rule Suppose 

Wx'uJyy'v'Rix', u , y' , i>'), R L (P(x, u, y, v)),x, u S P m ,y,v e P out (61) 

Vx'u'y'v'(R(x',u',y',v') D 3y"v"{x', u',y", v" : P(x,u,y,v)}) (62) 
Vx'y'w / (R 1 (x',y',w') = (R{x',w',y',w') A Vz'Vy"(i?(x', z', y", z') D z' > w 1 ))) (63) 
It is necessary to show R^(/j,(u, v, P(x, u,y,v),>)). Assume 

{x',y',w' : /j,(u,v,P,>)[x,y,v)}. (64) 

It is necessary to show R\(x' , y' , w'), that is, 

R(x ,w' \y ,w') A Vz'Vy" (R(x' , z ,y" , z') D z' > w'). (65) 
By the least fixpoint axiom, if



Vx u Sy'u {x , u , y , v : P(x,u, y,v)} 



(66) 



and y, v are outputs and x, u are inputs of P then 

{x',y',w' ; fi(u,v,P,>)(x,y,v)} = 

({x' >' ,y' >' : P(x,v,y,v)} A V z'Vy" {{x' , z' ,y" , z' : P(x,u,y,v)} ~D z' > w')).(67) 

Now, formula (66) follows from formula (61). For if $\{x', u', y', v' : P(x, u, y, v)\}$ and $\{x', u', y'', v'' :
P(x, u, y, v)\}$ then by formula (61), $R^L(P(x, u, y, v))$, so $R(x', u', y', v')$ and $R(x', u', y'', v'')$ hold,
and from the exclusivity hypothesis in formula (61) it follows that $y' = y''$ and $v' = v''$. This
proves formula (66). Then by the least fixpoint axiom, formula (67) follows. From formula (67) and
assumption (64) it follows that $\{x', w', y', w' : P(x, v, y, v)\}$, and therefore from $R^L(P(x, u, y, v))$
it follows that $R(x', w', y', w')$. To prove formula (65) it is also necessary to show

$\forall z' \forall y''(R(x', z', y'', z') \supset z' > w')$.    (68)

Suppose $R(x', z', y'', z')$. From formula (62) it follows that $\{x', z', y''', z'' : P(x, u, y, v)\}$
holds for some $y'''$ and $z''$, and therefore $R(x', z', y''', z'')$. Since $y'''$ and $z''$ are unique by
formula (61) they are equal to $y''$ and $z'$. Therefore $\{x', z', y'', z' : P(x, u, y, v)\}$, and then from
formula (67) it follows that $z' > w'$. This completes the proof.

There is an algorithm to extract $L$ programs from proofs in relational $PL(L)$, as follows:

Definition 5.1 The PL program operations are composition (;), conditional (?), variable deletion
($\exists$), procedure ($\uparrow$), application ($\downarrow$), and least fixpoint ($\mu$).

Definition 5.2 If $L$ is a PL-feasible language, and $P_1, P_2, \ldots, P_n$ are $L$ programs, then an $L$ pro-
gram term over $P_1, P_2, \ldots, P_n$ is either

1. one of the programs $P_i$, or

2. of the form $P; Q$, $x?P?Q$, $\uparrow_p^{y} P$, $\downarrow_p^{y} P$, $\exists x P$, or $\mu(x, y, P, >)$ where $P$ and $Q$ are $L$ program
terms over $P_1, P_2, \ldots, P_n$ and $x$, $y$, and $p$ are program variables, and where the preconditions
for these operators are satisfied.

Theorem 4 If $L$ is a PL-feasible language and there is a proof of an assertion of the form $R^L([y]P)$
from assertions of the form $R_i^L([y_i]P_i)$ in relational $PL(L)$, then $P$ is expressible as an $L$ program
term over $P_1\theta_1, P_2\theta_2, \ldots, P_n\theta_n$ for some output injective variable substitutions $\theta_i$.

Proof: By induction on proof depth. For depth 0, $P \equiv P_i$ for some $i$, and $P_i$ is trivially an $L$
program term over $P_1, \ldots, P_n$. Assume the theorem is true for proofs of depth $d$. A proof of
depth $d + 1$ consists of one or two proofs of depth $d$ followed by the application of an inference
rule. By induction, the theorem is true for the proof or proofs of depth $d$. Then, using the
forms of the inference rules, the theorem is also true for the proof of depth $d + 1$.
It is necessary to look at each inference rule. For the underlying logic rule, $P$ is not altered, so
the induction step holds. For permutation rules 1 and 2, only the variables of $P$ are renamed,
and renaming variables in an $L$ program term over $P_1\theta_1, \ldots, P_n\theta_n$ yields another $L$ program
term over $P_1\theta'_1, \ldots, P_n\theta'_n$ for suitable $\theta'_i$. The equality rule does not alter $P$. The substitution
rule applies an integer function $\sigma$ to the variables of $P$. This can be incorporated into the $\theta_i$
as well, but it is necessary to check that identifying variables of the $P_i$ does not invalidate any
inference rules used to obtain $P$. The correspondence rule does not affect $P$, only its preceding
list of variables. The definitional independence rule is similar to the substitution rule in its
effect on $P$. The remaining rules (composition rule, conditional rule, output deletion rule,
procedure rule, application rule, and least fixpoint rule) all produce $L$ program terms from $L$
program terms.

Now, the PL program operations of composition, conditional, output deletion, $\uparrow$, $\downarrow$, and $\mu$
have some preconditions, and it is necessary to check that these preconditions still hold in the
resulting $L$ program term after applying the substitution rule and the definitional independence
rule. The sequential composition operator $P; Q$ requires that no variable be in $P^{out} \cap Q^{out}$
and that no data variable be in $P^{in} \cap Q^{out}$. Assuming the former condition is true when the
; operator is applied, it will remain true because all substitutions are output injective. The
latter condition on data variables will remain true because no substitution identifies two data
variables unless both are inputs. The procedure operator axiom for $\uparrow_p^{y} P$ requires that $p$ be a
new variable or an input variable for $P$. Now, $p$ is an output variable of $\uparrow_p^{y} P$, which implies that
no substitution will identify $p$ with any other output variable (because substitutions are output
injective). Thus any substitution $\theta$ will only identify $p$ with input variables, so $p$ will still be
an input variable in $P\theta$ and the preconditions for this rule will still hold. The preconditions for
the application axiom are similar to those for composition, and similar reasoning applies. The
preconditions for $\mu(u, v, P, >)$ state that $u$ is an input and $v$ is an output to $P$. Also, $u$ is not a
free variable of $\mu(u, v, P, >)$ and $v$ is an output. Because of the rules for applying substitutions,
$u$ and $v$ will remain distinct, and $u$ will remain an input variable after substitutions are applied.
$v$ will remain an output variable as well, because the common image of an input and an output
variable is an output variable, so the preconditions for $\mu$ will continue to hold.



Corollary 1 If in addition the PL program operations composition, conditional, $\exists$, $\uparrow$, $\downarrow$, and $\mu$ are
effectively computable in $L$, in the sense that an $L$ program for $P; Q$ can effectively be obtained from
$L$ programs for $P$ and $Q$, et cetera, then an $L$ program $P$ such that $R^L([y]P)$ is effectively computable
from the proof, given $L$ programs for $P_1, P_2, \ldots, P_n$.

6 Abstract Inference Rules 

The preceding inference rules permit proofs of properties of programs in a specific programming 
language L. It is possible to modify these rules to obtain the system PL* that permits abstract
proofs of the existence of programs, but not in a specific language. Such proofs can then
be translated automatically into programs in specific PL-feasible programming languages. These
abstract rules involve assertions of the form $(\exists P)R^L(P)$ where P is a variable representing a program,
R is a relation on programs, and L is a variable representing a PL-feasible language. The form
of the proof not only guarantees that such a program P exists, but also permits a specific program
to be derived from the proof, as in other program generation systems. It is necessary to record the
list of input and output variables for each program variable P in order to use these inference rules;
rules appearing in section [S] suffice to compute these lists for variables P appearing in the conclusion
of each rule.

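\[ \frac{(\exists P)R_1^L(P),\quad R_1 \supset R_2}{(\exists P)R_2^L(P)} \]
Abstract underlying logic rule

\[ \frac{(\exists P)R_k^L(P),\quad k \text{ arbitrary}}{(\exists P)(\forall k\, R_k)^L(P)} \]
Abstract universal quantification rule

Rules about variables

\[ \frac{(\exists P)R^L([x]P(x)),\quad \Theta \text{ is a variable renaming}}{(\exists P)R^L([x\Theta]P(x\Theta))} \]
Abstract variable renaming rule

\[ \frac{(\exists P)R^L([x]P(x)),\quad \sigma \text{ is an integer permutation}}{(\exists P)R\sigma^L([\sigma^{-1}(x)]P(x))} \]
Abstract permutation rule 1

\[ \frac{(\exists P)R^L([x]P(x)),\quad \sigma \text{ is an integer permutation}}{(\exists P)R\sigma^L([x]P(\sigma(x)))} \]
Abstract permutation rule 2

\[ \frac{(\exists P)R^L(P(x)),\quad x_i = x_j,\quad R_1(y) \equiv (R(y) \wedge y_i = y_j)}{(\exists P)R_1^L(P(x))} \]
Abstract equality rule

\[ \frac{(\exists P)R^L([x]P(x)),\quad \sigma \text{ is an integer function such that } \sigma \text{ does not identify data variables of } x}{(\exists P)R\sigma^L([x]P(\sigma(x)))} \]
Abstract substitution rule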

\[ \frac{(\exists P)R^L([x]P),\quad \forall f, g\,\bigl(R(f(x)) \wedge \forall u \in FV(P) \cup ([x]P)^{data}\,(f(u) = g(u)) \supset R_1(g(y))\bigr)}{(\exists P)R_1^L([y]P)} \]
Abstract correspondence rule

\[ \frac{(\exists P)R^L([x]P(x)),\quad \Theta \text{ output injective on } P \text{ and does not identify data variables of } P}{(\exists P)R^L([x\Theta]P(x\Theta))} \]
Abstract definitional independence rule
\[ \frac{(\exists P)R^L([x, \psi]P),\quad y \text{ a new data variable},\quad R(u, v) \wedge R(u, w) \supset R_1(u, v, w)}{(\exists P)R_1^L([x, y, \psi]P)} \]
Abstract new variable rule
Rules about program operations

(3P_0Pf (P^y)), {3P 2 )R%{P 2 {z)), 
precj^ P, Q, /, fl ), V/g(fli (/(g)) A P 2 (g(z)) D R(0p,Q (/, fl) ° ?))) 

(3P)R L (P(y,z)) 

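Abstract composition rule

\[ \frac{(\exists P_1)R_1^L(P_1(y)),\quad (\exists P_2)R_2^L(P_2(z)),\quad y \subseteq FV(P_1),\quad z \subseteq FV(P_2),\quad \forall x'y'z'\,((x' \wedge R_1(y')) \vee (\neg x' \wedge R_2(z')) \supset R(x', y', z'))}{(\exists P)R^L(P(x, y, z))} \]
Abstract conditional rule

\[ \frac{(\exists P)R^L([u, x, v]P),\quad R_1(u, v) \equiv \exists x\, R(u, x, v)}{(\exists P)R_1^L(P)} \]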

(3P)R L (P(x,y)),R 1 (x , ,g) = VgW) D P(x',yQ) 
Vx , 3qR 1 (w',q) A (3P)Pf(P) 

q(iJ) D R(v, q) 



Abstract output deletion rule 



Abstract procedure rule 



(3P)R^([y,p]P) 



Abstract application rule 



Vx'u'T,y'v'R(x',u',y',v'), 
(3P)(R L (P{x,u,y,v)),x,u G P m ,y,v e P°"*, 
AVxWt/(P(z>',y>') D 3yV'{x>', y'>" : P(x, u, y, «)})), 
Mx'w'y'jRijx'.y'.w 1 ) = (P(j' ; w', y', w') A Vz(P(s', z', y', z') D z' > L u-'))) Abstract least 

(3P)Pf (P) fixpoint rule 

It is possible to translate proofs in PL* into programs in any PL-feasible language M:

Theorem 5 There is an algorithm which, given a PL* proof of an assertion of the form $(\exists P)R^L([y]P)$
from assertions of the form $(\exists P_i)R_i^L([y_i]P_i)$ (where L is a variable representing a PL-feasible
language), and given a PL-feasible language M and M programs $P_i$ such that $R_i^M([y_i]P_i)$, produces an
M-program P such that $R^M([y]P)$.

Proof: It is straightforward to translate PL* proofs into PL(L) proofs, for any PL-feasible language
L, and then apply the algorithm of Corollary 1.

7 Abstract programs 

Corresponding to abstract inference rules there are abstract programs in PL. 
Definition 7.1 An abstract PL program is either 

1. A variable X, representing a program fragment, or 
2. Of the form $P; Q$, $x?P?Q$, $\uparrow^{p}_{\bar{x}} P$, $\downarrow^{p}_{\bar{x}}$, $\exists x P$, or $\mu(x, y, P, \geq)$ where P and Q are abstract PL
programs and x, y, p, and $\bar{x}$ are program variables.

L* is the set of abstract PL programs. The notation $P[X_1, X_2, \ldots, X_n]$ refers to an abstract PL
program, where P is a composition of the PL program operations ;, ?, $\uparrow$, $\downarrow$, $\exists$, and $\mu$ and $X_1, \ldots, X_n$
is a listing of all the variables in P representing program fragments. By contrast, $P(x)$ represents
the program P mentioning the program variables x. If $P_1, P_2, \ldots, P_n$ are L programs for some
PL-feasible language L, then $P[P_1, P_2, \ldots, P_n]$ denotes the L program term that results from replacing
all occurrences of $X_i$ in $P[X_1, \ldots, X_n]$ by $P_i$. The program variables x can also be indicated as in
$P[P_1(x), P_2(x), \ldots, P_n(x)]$.

By Theorem 5, a PL* proof can be converted to a PL(L) proof, for any PL-feasible L, and from
the PL(L) proof an L program term can be obtained. This L program term is much like an abstract
PL program, but it contains substitutions on the programs $P_i$. It is possible to eliminate these
substitutions and also the dependence on the particular proof system PL(L) or PL*, as follows.

Definition 7.2 If $P(x)$ is a PL(L)-program having x as free variables, then $[[P(x)]]$, the semantics
of $P(x)$, is the set $\{f : \{f(x) : P(x)\}\}$, where f is a function from variables to their semantics.

Theorem 6 If P and Q are L programs, then $[[P; Q]]$ is a function of $[[P]]$ and $[[Q]]$, $[[\exists x P]]$ is a
function of $[[P]]$, $[[\uparrow^{p}_{\bar{x}} P]]$ is a function of $[[P]]$, and similarly for the other PL program operations.

Proof: By consideration of the definition of each operation, noting that the semantics of the oper- 
ations depend only on the semantics of the operands. 

Definition 7.3 Extend the PL operations on programs to operations on their semantics, so that
$[[P; Q]] = [[P]]; [[Q]]$, $[[\exists x P]] = \exists x [[P]]$, and so on, thus giving names to the functions in Theorem 6.

Theorem 7 For every L program $P[P_1, \ldots, P_n]$ where P is an abstract PL program (composed of
the PL program operations) and $P_i$ are variables representing L programs, there is a function $f_P^L$
depending on P but not on $P_1, \ldots, P_n$ such that

\[ f_P^L([[P_1]], \ldots, [[P_n]]) = [[P[P_1, \ldots, P_n]]] \qquad (69) \]

for all L programs $P_1, \ldots, P_n$.

Proof: By induction on the depth of P, using Theorem [5] 

Theorem 8 For every abstract L program $P[X_1, \ldots, X_n]$ where P is an abstract PL program (composed
of the PL program operations) and $X_i$ are variables representing program fragments, there is
a function $f_P$ depending on P but not on $X_1, \ldots, X_n$ such that

\[ f_P([[X_1]], \ldots, [[X_n]]) = [[P[X_1, \ldots, X_n]]] \qquad (70) \]

for all $X_1, \ldots, X_n$.

Proof: By induction on the depth of P, using Theorem and Definition 17.31 
Theorem 9 For any PL feasible language L and any L program $P[P_1, \ldots, P_n]$ where P is an
abstract PL program, the abstract program $P[X_1, \ldots, X_n]$ satisfies $f_P^L = f_P$.

Proof: $f_P^L([[P_1]], \ldots, [[P_n]]) = [[P[P_1, \ldots, P_n]]] = f_P([[P_1]], \ldots, [[P_n]])$ by Theorems 7 and 8.

Corollary 2 Suppose that $P[X_1, \ldots, X_n]$ is an abstract PL program and $A_1, \ldots, A_n, A$ are assertions
such that for all program fragments $X_1, \ldots, X_n$, $A_1([[X_1]]) \wedge \ldots \wedge A_n([[X_n]]) \supset A([[P[X_1, \ldots, X_n]]])$.
Then for any PL feasible language L and any L programs $P_1, \ldots, P_n$ of appropriate sorts,
$A_1([[P_1]]) \wedge \ldots \wedge A_n([[P_n]]) \supset A([[P[P_1, \ldots, P_n]]])$.

Proof: From the hypothesis Ai([[Xi]]) A ... A A n ([[Jf„]]) D A{[[P[X X , X n }]]) and TheoremElit 
follows that Ai([[Xi]])A. . .AA n ([[X n }}) D Aifp^X^}, [[X n ]})). From TheoremElit follows 
that Ai([[Xi]]) A ... A A n ([[X n ]}) D . . . , [[X n ]})). From Theoremdit follows that 

A 1 ([[P 1 }}) A ... A A n ([[P n }}) D A([[P[P U P n }}]). 

This yields the following method for constructing L programs satisfying a specification: 

1. Construct an abstract PL program $P[X_1, \ldots, X_n]$.

2. Show that P satisfies the specification $A_1([[X_1]]) \wedge \ldots \wedge A_n([[X_n]]) \supset A([[P[X_1, \ldots, X_n]]])$.

3. Choose L programs $P_1, \ldots, P_n$.

4. Show that these programs satisfy $A_1([[P_1]]), \ldots, A_n([[P_n]])$.

5. Conclude that the L program $P[P_1, \ldots, P_n]$ satisfies the specification $A([[P[P_1, \ldots, P_n]]])$.
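
The five steps can be mimicked informally in an executable setting. The following Python sketch is only an analogy and not the PL formalism: it models an abstract program with two fragment variables as a higher-order function, and it models the assertions as predicates that are checked by testing on a few sample inputs rather than proved. All names (`abstract_program`, `increases`, `increases_by_two`, `add_one`, `add_three`) are hypothetical and chosen for illustration.

```python
from typing import Callable

Fragment = Callable[[int], int]
SAMPLES = range(-5, 6)  # finite test inputs; testing is not a proof


def abstract_program(x1: Fragment, x2: Fragment) -> Fragment:
    """Step 1: an abstract program P[X1, X2], here 'run X1, then X2'."""
    return lambda n: x2(x1(n))


def increases(frag: Fragment) -> bool:
    """Assertion A_i: the fragment returns a value larger than its input."""
    return all(frag(n) > n for n in SAMPLES)


def increases_by_two(prog: Fragment) -> bool:
    """Assertion A: the whole program raises its input by at least 2."""
    return all(prog(n) >= n + 2 for n in SAMPLES)


# Step 2 (argued on paper): over the integers, if X1 and X2 both satisfy
# 'increases' everywhere, then P[X1, X2] satisfies 'increases_by_two'.

# Step 3: choose concrete programs for the fragments.
def add_one(n: int) -> int:
    return n + 1


def add_three(n: int) -> int:
    return n + 3


# Step 4: check the fragment assertions (by testing, in this sketch).
fragments_ok = increases(add_one) and increases(add_three)

# Step 5: the instantiated program then meets the overall specification.
concrete = abstract_program(add_one, add_three)
```

The point of the design is that step 2 is done once for the abstract program, while steps 3 to 5 can be repeated for any choice of fragments without revisiting the argument about the composition.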

This shows that one can construct abstract PL programs satisfying a specification, and from 
them one can construct L programs satisfying the specification, for any PL-feasible language L. L* 
programs are somewhat similar to "pseudocode" descriptions of algorithms found in textbooks, but 
unlike pseudocode, L* programs have a formal syntax and semantics, which permit programs to be 
verified. It would of course be possible to verify a program P in some particular language such as 
C and translate P to other languages L. Why is L* any better for this purpose? The syntax and 
semantics of L* are simple, making it easier to write such a translator and the translator is more 
likely to produce efficient code in L. It is also easier to verify L* programs than C programs. 

Another possibility would be to verify a program in lambda calculus or $\mu$ calculus or some other
language with a simple syntax and semantics and translate this program into other languages. An
advantage of L* is that it has features to guide the translation, such as the distinction between
procedures and data, the use of ; to signify sequential composition, the use of $\exists$ to signify variable
declarations, the use of conditionals, and so on. This means that in the abstract program one can 
give guidance about how the algorithm should be expressed to gain efficiency. 

In fact, an abstract program can be considered as a way to formally describe algorithms. A 
description of an algorithm in a particular programming language gives extraneous details related 
to the programming language syntax but not to the algorithm. A pseudocode description of al- 
gorithms as found in textbooks does not have a precise syntax and semantics. Turing machine 
descriptions also contain extraneous details and lack abstraction and do not capture the efficiency 
of data structures. Pure functional languages without destructive assignments do not permit an
imperative programming style, which can lead to inefficiency. More abstract notations such as lambda
calculus and $\mu$ calculus give too little guidance concerning efficient code generation, which generally
requires destructive modification of data structures, side effects, and conditional statements, and are 
difficult to translate efficiently into more conventional programming languages. Thus PL is abstract 
enough to avoid extraneous details about syntax but not too abstract to express program features 
that have a major influence on efficiency. 

The emphasis of PL is not so much the automatic construction of programs or even automatic 
proofs of their correctness, but rather the ability to write abstract proofs or programs that can be 
translated into a wide variety of other languages, to avoid the necessity of writing the same program 
over and over again in different languages. Probably it would be most efficient for the abstract 
programs to be coded by humans and stored in a library. It does not appear feasible to construct 
complex programs by automatic program generation methods in most cases. The PL approach 
permits a reduction of programmer effort even in the absence of automatic program generation. The 
system PL can even be used without formal proofs of correctness; the programs Pi can be verified to 
satisfy the assertions Ai, or this can just be checked by testing, to gain some measure of reliability 
without a formal proof. In fact, it is not even necessary to know that the language L is PL-feasible; 
this can be verified in a large number of cases by testing, to gain some confidence in the reliability 
of the programs. 

Abstract programs may be parameterized. For example, the precision of floating point operations 
may be a parameter. If this precision is too high, then the abstract program may not translate into 
as many languages. Another example of a parameter might be the length of character strings. If 
different languages implement different length character strings, then depending on the values of 
this parameter, the abstract program would translate into a different set of languages. However, if 
the abstract program is correct regardless of the parameter values, then the L programs resulting 
from it will also be correct for all values of the parameters. 

The abstract programs are not necessarily easy to read or understand, although readability is 
easier for the textual syntax. Here is an abstract program for factorial: 

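\[ \mu(f, g, \uparrow^{g}_{n,v}(\exists twxyz\ \downarrow^{0}_{t};\ \downarrow^{=}_{ntw};\ w\,?\,\downarrow^{1}_{v}\,?\,\downarrow^{1}_{x};\ \downarrow^{-}_{nxy};\ \downarrow^{f}_{yz};\ \downarrow^{*}_{nzv}), \geq) \]

Another approach is to allow the defined symbol g to be one (f) that already appears, yielding the
following simpler program:

\[ \uparrow^{f}_{n,v}(\exists twxyz\ \downarrow^{0}_{t};\ \downarrow^{=}_{ntw};\ w\,?\,\downarrow^{1}_{v}\,?\,\downarrow^{1}_{x};\ \downarrow^{-}_{nxy};\ \downarrow^{f}_{yz};\ \downarrow^{*}_{nzv}) \]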

This program has the following PL textual syntax: 

proc f(n, v);
    var t, w, x, y, z;
    call 0(t);
    call =(n, t, w);
    if w then call 1(v) else call 1(x); call -(n, x, y); call f(y, z); call *(n, z, v) fi
end f;

Here f(n, v) is a procedure with input n and output v, 0(t) sets t to zero, =(n, t, w) sets w to
true if n = t, false otherwise, 1(v) sets v to one, -(n, x, y) sets y to n - x, and *(n, z, v) sets v to
n * z. If w is true then v is set to 1 else x is set to 1, y is set to n - x, f(y, z) is called, and v is set
to n * z. The $\mu$ operator is not necessary in this case. In fact, for many PL feasible L, it is never
necessary to use $\mu$, because $\mu(u, v, P[u, v, x], \geq) = P[v, v, x]$.
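
As an illustration of what a translation of this textual program might look like, here is a hand-written Python rendering that follows the abstract program call by call. It is a sketch under assumptions, not the output of any PL translator: output parameters are modeled as return values, and the helper names (`zero`, `equals`, `one`, `minus`, `times`) are hypothetical stand-ins for the primitive procedures 0, =, 1, - and *.

```python
def zero() -> int:
    """0(t): produce the value zero for the output t."""
    return 0


def equals(n: int, t: int) -> bool:
    """=(n, t, w): w is true if n = t, false otherwise."""
    return n == t


def one() -> int:
    """1(v): produce the value one for the output v."""
    return 1


def minus(n: int, x: int) -> int:
    """-(n, x, y): y is n - x."""
    return n - x


def times(n: int, z: int) -> int:
    """*(n, z, v): v is n * z."""
    return n * z


def f(n: int) -> int:
    """proc f(n, v): input n, output v (returned)."""
    t = zero()            # call 0(t)
    w = equals(n, t)      # call =(n, t, w)
    if w:
        v = one()         # call 1(v)
    else:
        x = one()         # call 1(x)
        y = minus(n, x)   # call -(n, x, y)
        z = f(y)          # call f(y, z)
        v = times(n, z)   # call *(n, z, v)
    return v
```

A translator aiming at efficient code would presumably inline the primitive calls; they are kept separate here only to make the correspondence with the abstract program visible.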

The abstract programs could be made more abstract in various ways, such as making them 
polymorphic. 

As an example of the use of data, consider the following program to update all elements of an 
array of length n: 

proc Update(A, n)
    call Update1(A, n);
    if n > 1 then call Update(A, n - 1) fi;
end Update;

Here Update1(A, n) returns array $A^{fin}$ with the nth element updated. The variables A and n
are data variables and Update and Update1 are procedures. Without the use of data variables,
one would have to recopy the whole array to update each element. The fact that the parameter A
to Update1 is both an input and an output of the procedure avoids this inefficiency.
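
The efficiency point can be made concrete with a small Python sketch. It is an illustration under assumptions: the behavior of Update1 is not specified in the text, so doubling the element is an arbitrary choice, and `update_copying` is a hypothetical contrast case showing what a purely value-based version would have to do.

```python
def update1(a: list, n: int) -> None:
    """Hypothetical Update1: modify the n-th element (1-based) in place."""
    a[n - 1] = a[n - 1] * 2  # doubling is an arbitrary example update


def update(a: list, n: int) -> None:
    """Update(A, n): A is both input and output, so no copy is made."""
    update1(a, n)
    if n > 1:
        update(a, n - 1)


def update_copying(a: list, n: int) -> list:
    """Contrast: without an in/out array, every step recopies the array."""
    b = list(a)              # whole array copied to change one element
    b[n - 1] = b[n - 1] * 2
    return update_copying(b, n - 1) if n > 1 else b
```

The in-place version does a constant amount of work per element, while the copying version does work proportional to the array length at every step.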

In A* notation, without syntactic sugar, the above program would be 

^.Update 1 1 Updatel . q / I 1 . I - . |= . ~7 I Update<?\\ 
I An \-*-An ' -'• L y^\-i-xi i-nxy> i-yOzi " • i-Ay •)) 

One can obtain the effect of global variables as data variables that are inputs to several proce- 
dures, as follows: 

where $u \in P^{in} \cap Q^{in}$ and u is a data variable. Input and output are essentially global data variables
representing files, and read and write statements modify these variables.

One can obtain iterative loops by recursion, or as special procedures that are known to the 
compiler and that permit compilation by iteration instead of recursion. They can also be added as 
operators to PL, much as the conditional operator was added. 

8 An example: Quicksort 

This section presents an example program, quicksort, as an abstract program with an associated 
proof of correctness. A sketch of a translation of the abstract program into a quicksort program in C 
follows. This example illustrates the proof rules as well as the importance of destructive modification 
of data for program efficiency. It is not necessary to consider side effects because no operations in 
the quicksort program have side effects. 

The textual syntax for the Quicksort program, with some simplifications, is the following: 
proc Quicksort(A, p, r)
    var q;
    if p < r
        then Partition(A, p, r, q);
             Quicksort(A, p, q);
             Quicksort(A, q + 1, r)
    fi
end Quicksort

This corresponds to the abstract program 

T Q lnTii(\ lt ■T r >(\ p • \® ■ \ inc - \Q \\ 
I Apr ^^yy^-prx' ^ ' \-*-Aprqi * Apq> ^qy ' *Ayr>) 

where Q is Quicksort, P is Partition, lt(p, r, x) sets x to true if p < r, false otherwise, and
inc(q, y) sets y to q + 1. Define abstract programs $P_1, P_2, P_3, P_4, P_5$, and $P_6$, respectively, as follows:

\[
\begin{aligned}
P_1(Q, A, p, q) &\equiv \downarrow^{Q}_{Apq} \\
P_2(P, A, p, q, r) &\equiv \downarrow^{P}_{Aprq} \\
P_3(P, Q, A, p, q, r) &\equiv P_2(P, A, p, q, r);\ P_1(Q, A, p, q) \\
P_4(P, Q, A, p, q, r, y) &\equiv P_3(P, Q, A, p, q, r);\ \downarrow^{inc}_{qy};\ \downarrow^{Q}_{Ayr} \\
P_5(P, Q, A, p, q, r, x, y) &\equiv \downarrow^{lt}_{prx};\ x\,?\,P_4(P, Q, A, p, q, r, y) \\
P_6(P, Q) &\equiv \uparrow^{Q}_{Apr} \exists qxy\, P_5(P, Q, A, p, q, r, x, y)
\end{aligned}
\]

If L is the language C, then $P_1(Q, A, p, q)$ might correspond to the statement "Q(A,p,q);",
$P_2(P, A, p, q, r)$ might correspond to the statement "P(A,p,q,r);", $P_3(P, Q, A, p, q, r)$ might
correspond to the sequence "P(A,p,q,r); Q(A,p,q);" of two procedure calls, et cetera. The program
fragment $\exists qxy$ would correspond to the declarations "int q,x,y" in this case, assuming q, x, and y
have integer sorts, but could correspond to the statement "float q,x,y;" if q, x, and y had real
number (floating point) sorts. The C program for $P_6(P, Q)$ would be something like
"void Q(A,p,r) int A[],p,r; int q,x,y; { $P_5^L(P, Q, A, p, q, r, x, y)$ }" where L is C. The C program for $P_6(P, R)$ would be
"void R(A,p,r) int A[],p,r; int q,x,y; { $P_5^L(P, R, A, p, q, r, x, y)$ }", showing how program variables
(names of procedures or data variables) in program text can be replaced. Quicksort programs in
other languages besides C could be generated in a similar manner.
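
To make the idea of generating quicksort in another language concrete, here is a hand-written Python analogue of the abstract programs. It is a sketch under assumptions, not a product of the PL system: `make_quicksort` plays the role of $P_6(P, Q)$ by taking the Partition procedure as a parameter, the output q of Partition is modeled as a return value, and `hoare_partition` is a hypothetical choice of Partition (the classical Hoare scheme) that the text itself leaves unspecified.

```python
from typing import Callable, List

Partition = Callable[[List[int], int, int], int]


def hoare_partition(a: List[int], p: int, r: int) -> int:
    """Hypothetical Partition: rearranges a[p..r] in place and returns q
    with p <= q < r such that every element of a[p..q] is <= every
    element of a[q+1..r]. Assumes p < r."""
    pivot = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > pivot:
            j -= 1
        i += 1
        while a[i] < pivot:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def make_quicksort(partition: Partition) -> Callable[[List[int], int, int], None]:
    """Analogue of P6(P, Q): defines the procedure Q in terms of P."""
    def quicksort(a: List[int], p: int, r: int) -> None:
        x = p < r                    # lt(p, r, x)
        if x:                        # x ? P4(...)
            q = partition(a, p, r)   # P2: Partition(A, p, r, q)
            quicksort(a, p, q)       # P1: Quicksort(A, p, q)
            y = q + 1                # inc(q, y)
            quicksort(a, y, r)       # Quicksort(A, y, r)
    return quicksort
```

A typical use is `sort = make_quicksort(hoare_partition)` followed by `sort(data, 0, len(data) - 1)`, which sorts `data` in place; any other partition procedure with the same contract could be passed instead.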

To give a proof of correctness, define perm(A, B) for two arrays A and B to specify that the 
elements of B are a permutation of the elements of A, and define relations as follows: 

\[
\begin{aligned}
R_{perm}(A) &\equiv perm(A^{init}, A^{fin}) \\
R_{bdry}(x, y) &\equiv (x^{init} = x^{fin} \wedge y^{init} = y^{fin}) \\
R_{split}(A, p, q, r) &\equiv \forall i j\,((p^{init} \leq i \leq q^{fin} \wedge q^{fin} < j \leq r^{init}) \supset A^{fin}[i] \leq A^{fin}[j]) \\
R_{part}(A, p, q, r) &\equiv R_{perm}(A) \wedge R_{bdry}(p, r) \wedge (p^{init} \leq q^{fin}) \wedge (q^{fin} < r^{init}) \wedge R_{split}(A, p, q, r) \\
R_{sort}(A, x, y) &\equiv \forall i j\,(x \leq i \leq j \leq y \supset A^{fin}[i] \leq A^{fin}[j])
\end{aligned}
\]
For convenience, it helps to identify program variables with their semantics when defining
relations. $R^L(P(x))$ is defined to mean $\forall y(\{y : P(x)\} \supset R(y))$. This can also be written as $f \in [[P(x)]] \supset
R(f(x_1), \ldots, f(x_n))$. Defining $R_f(x_1, \ldots, x_n)$ as $R(f(x_1), \ldots, f(x_n))$, one has $f \in [[P(x)]] \supset R_f(x)$.
It is more convenient to give the relations $R_f$ than R because $R_f$ mentions the program variables
x rather than their semantics y. It is these relations $R_f$ rather than R that follow. For example,
in order to express that $\forall y_1 y_2(\{y_1, y_2 : P(x_1, x_2)\} \supset y_2 = y_1 + 1)$, one would ordinarily define
$R(y_1, y_2) \equiv (y_2 = y_1 + 1)$, but identifying program variables $x_1, x_2$ with their semantics $y_1, y_2$ one
specifies $R(x_1, x_2) \equiv (x_2 = x_1 + 1)$, which is more intuitive because $x_1$ and $x_2$ appear in P.

Recall that data variable x in a program fragment P has semantics $(\alpha, \beta)$ where $\alpha$ is the initial
value of x and $\beta$ is the final value. By convention, $(\alpha, \beta)^{init} = \alpha$ and $(\alpha, \beta)^{fin} = \beta$. If one identifies
variables with their semantics, $\alpha = x^{init}$ and $\beta = x^{fin}$.
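
The relations on initial and final values can be read as executable checks. The Python sketch below is an informal illustration under assumptions: each data variable's semantics is represented by passing its initial and final values separately, the boundary relation on p and r is omitted because the integer arguments here are never modified, and the choice between strict and non-strict inequalities follows the usual contract of a partition step (p <= q < r, left part no larger than right part), which is an interpretation rather than something the text states explicitly. The function names are hypothetical.

```python
from collections import Counter
from typing import List


def r_perm(a_init: List[int], a_fin: List[int]) -> bool:
    """The final array is a permutation of the initial array."""
    return Counter(a_init) == Counter(a_fin)


def r_split(a_fin: List[int], p: int, q: int, r: int) -> bool:
    """Every element of a_fin[p..q] is <= every element of a_fin[q+1..r]."""
    return all(a_fin[i] <= a_fin[j]
               for i in range(p, q + 1)
               for j in range(q + 1, r + 1))


def r_part(a_init: List[int], a_fin: List[int], p: int, q: int, r: int) -> bool:
    """Specification of a partition step on a[p..r] with split point q."""
    return r_perm(a_init, a_fin) and p <= q < r and r_split(a_fin, p, q, r)


def r_sort(a_fin: List[int], x: int, y: int) -> bool:
    """a_fin[x..y] is in nondecreasing order."""
    return all(a_fin[i] <= a_fin[j]
               for i in range(x, y + 1)
               for j in range(i, y + 1))
```

For instance, with initial array `[3, 1, 2, 5, 4]` and final array `[2, 1, 3, 5, 4]`, the partition relation holds for p = 0, q = 2, r = 4, while the sortedness relation on the whole final array does not.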

Define relations $R_1, R_2, R_3, R_4$, and $R_5$ to be satisfied by the programs $P_1, P_2, P_3, P_4$, and $P_5$,
respectively, and relation $R_{qs}$ as follows:

\[
\begin{aligned}
R_1(A, p, q) &\equiv R_{perm}(A) \wedge R_{bdry}(p, q) \wedge R_{sort}(A, p^{init}, q^{init}) \\
R_2(A, p, q, r) &\equiv ((p < r) \supset R_{part}(A, p, q, r)) \\
R_3(A, p, q, r) &\equiv ((p < r) \supset R_{part}(A, p, q, r) \wedge R_{sort}(A, p^{init}, \ldots)) \\
R_4(A, p, q, r) &\equiv ((p < r) \supset R_1(A, p, r)) \\
R_5(A, p, q, r) &\equiv R_1(A, p, r) \\
R^{k}_{qs}(Q) &\equiv \forall A'p'q'\,((q' - p') < k \wedge Q(A', p', q') \supset R_1(A', p', q')) \\
R_{qs}(Q) &\equiv \forall A'p'q'\,(Q(A', p', q') \supset R_1(A', p', q'))
\end{aligned}
\]

$R_{qs}(Q)$ is the final specification for the quicksort program. In order to prove correctness, it is
necessary to show that for all k, $R^{k}_{qs}(Q)$ implies $R^{k+1}_{qs}(Q)$ and use induction. For this purpose, define
relations $R_i^k$ specifying a bound on the sizes of the modified subarrays, as follows:

\[
\begin{aligned}
R_1^k(A, p, q, Q) &\equiv (R^{k}_{qs}(Q) \wedge (q - p) < k \supset R_1(A, p, q)) \\
R_2^k(A, p, q, r) &\equiv ((r - p) < k + 1 \supset R_2(A, p, q, r)) \\
R_i^k(A, p, q, r, Q) &\equiv (R^{k}_{qs}(Q) \wedge (r - p) < k + 1 \supset R_i(A, p, q, r)), \quad i = 3, 4 \\
R_5^k(A, p, q, r, Q) &\equiv ((p \geq r) \supset A^{init} = A^{fin}) \wedge (R^{k}_{qs}(Q) \wedge (r - p) < k + 1 \supset R_5(A, p, q, r))
\end{aligned}
\]

The formula $R_1^{k,L}([A, p, q, Q]P_1)$ follows directly from the application axiom, because $Q(A, p, q)$
holds, essentially, and because $R_{qs}(Q)$ holds. One then shows that if $R_2^{L}([A, p, q, r]P_2)$, then
$R_3^{k,L}([A, p, q, r, Q]P_3)$, $R_4^{k,L}([A, p, q, r, Q]P_4)$, and $R_5^{k,L}([A, p, q, r, Q]P_5)$. (For convenience the variables x,
y, and P (Partition) are omitted.) Because k is arbitrary, one has $(\forall k R_5^k)^L([A, p, q, r, Q]P_5)$ by the
universal quantification rule, and $\forall k R_5^k(A, p, q, r, Q) \equiv ((p \geq r) \supset A^{init} = A^{fin}) \wedge \forall k(R^{k}_{qs}(Q) \wedge (r - p) < k + 1 \supset R_5(A, p, q, r))$.
From this, using the procedure rule, one derives $R^{0}_{qs}(Q) \wedge \forall k(R^{k}_{qs}(Q) \supset
R^{k+1}_{qs}(Q))$ for the program $[Q]P_6$. Using the underlying logic rule and mathematical induction,
$R_{qs}(Q)$ follows.

The final program does not include code for Partition. Any verified program for Partition 
can be inserted and will give a correct quicksort program. Thus the final verified code has some 
flexibility. 